Raw data

From Wikipedia, the free encyclopedia
"Primary data" redirects here. For data that has been created at the time under study, see Primary source.
Collection of information that has not been fully processed or analyzed

The two columns to the right of the left-most column in this computerized table are raw data.

Raw data, also known as primary data, are data (e.g., numbers, instrument readings, figures) collected from a source. In the context of examinations, the raw data might be described as a raw score (after test scores).

If a scientist sets up a computerized thermometer that records the temperature of a chemical mixture in a test tube every minute, the list of temperature readings, whether printed out on a spreadsheet or viewed on a computer screen, is "raw data". Raw data have not been subjected to processing, to "cleaning" by researchers to remove outliers, obvious instrument reading errors, or data entry errors, or to any analysis (e.g., determining central tendency measures such as the average or median result). Nor have raw data been subject to any other manipulation by a software program or a human researcher, analyst, or technician. They are also referred to as primary data. "Raw data" is a relative term (see data), because even once raw data have been "cleaned" and processed by one team of researchers, another team may consider these processed data to be "raw data" for another stage of research. Raw data can be input to a computer program or used in manual procedures such as analyzing statistics from a survey. The term "raw data" can also refer to the binary data on electronic storage devices, such as hard disk drives (also referred to as "low-level data").
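The thermometer example can be illustrated with a minimal sketch. The readings, the plausible-range limits, and the function name `clean` are all hypothetical, chosen only to show one possible cleaning rule followed by a central tendency calculation.

```python
from statistics import mean, median

# Hypothetical raw readings in degrees Celsius, one per minute.
# 999.9 stands in for an obvious instrument reading error.
raw_readings = [21.4, 21.6, 21.5, 999.9, 21.7, 21.8]


def clean(readings, low=-50.0, high=150.0):
    """Keep only readings inside an assumed plausible range."""
    return [r for r in readings if low <= r <= high]


# The raw list is left untouched; cleaning produces a new list.
cleaned = clean(raw_readings)

# Central tendency of the cleaned (no longer raw) data.
summary = {"mean": round(mean(cleaned), 2), "median": median(cleaned)}
```

Keeping `raw_readings` unchanged reflects the relative nature of the term: another team could apply a different cleaning rule to the same raw list.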

Generating data

Data can be generated in two ways. The first, 'captured data',[1] is obtained through purposeful investigation or analysis. The second, 'exhaust data',[1] is gathered, usually by machines or terminals, as a secondary function. For example, cash registers, smartphones, and speedometers serve a main function but may collect data as a secondary task. Exhaust data is usually too voluminous or of too little use to process, and so becomes 'transient', that is, it is thrown away.[1]

Examples

In computing, raw data may have the following attributes: it may contain human, machine, or instrument errors; it may not be validated; it might be in different regional or colloquial formats; it may be uncoded or unformatted; or some entries might be "suspect" (e.g., outliers), requiring confirmation or citation. For example, a data input sheet might contain dates as raw data in many forms: "31st January 1999", "31/01/1999", "31/1/99", "31 Jan", or "today". Once captured, this raw data may be processed and stored in a normalized format, perhaps a Julian date, to make it easier for computers and humans to interpret during later processing. Raw data (sometimes colloquially called "sources" data or "eggy" data, the latter a reference to the data being "uncooked", that is, "unprocessed", like a raw egg) are the data input to processing. A distinction is made between data and information, to the effect that information is the end product of data processing. Raw data that have undergone processing are sometimes referred to as "cooked" data in a colloquial sense. Although raw data have the potential to become "information", extraction, organization, analysis, and formatting for presentation are required before they can be transformed into usable information.
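The date example can be sketched as follows. This is only an illustration under stated assumptions: "Julian date" is interpreted here as the Julian Day Number, a missing year is taken from a caller-supplied reference date, "today" is resolved to that same reference date, and month names are parsed in an English locale. The function names `normalize` and `julian_day_number` are hypothetical.

```python
import re
from datetime import date, datetime

# Offset between Python's proleptic Gregorian ordinal and the
# Julian Day Number (2000-01-01 has JDN 2451545).
JDN_OFFSET = 1721425


def normalize(raw, reference):
    """Return a date for one raw entry; `reference` fills in missing context."""
    text = raw.strip().lower()
    if text == "today":
        return reference
    # Drop ordinal suffixes: "31st" becomes "31".
    text = re.sub(r"(\d+)(st|nd|rd|th)\b", r"\1", text)
    candidates = [
        (text, "%d %B %Y"),   # 31 January 1999
        (text, "%d/%m/%Y"),   # 31/01/1999
        (text, "%d/%m/%y"),   # 31/1/99
        (f"{text} {reference.year}", "%d %b %Y"),  # 31 Jan (year assumed)
    ]
    for value, fmt in candidates:
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {raw!r}")


def julian_day_number(d):
    """Julian Day Number of a Gregorian calendar date."""
    return d.toordinal() + JDN_OFFSET


raw_dates = ["31st January 1999", "31/01/1999", "31/1/99", "31 Jan", "today"]
reference = date(1999, 1, 31)  # assumed date on which the sheet was filled in
normalized = [normalize(r, reference) for r in raw_dates]
```

Entries that match none of the assumed formats raise an error rather than being guessed, so "suspect" values stay visible for later confirmation.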

For example, a point-of-sale terminal (POS terminal, a computerized cash register) in a busy supermarket collects huge volumes of raw data each day about customers' purchases. However, this list of grocery items, their prices, and the time and date of purchase does not yield much information until it is processed. Once processed and analyzed by a software program, or even by a researcher using pen, paper, and a calculator, this raw data may indicate the particular items that each customer buys, when they buy them, and at what price; an analyst or manager could also calculate the average total sales per customer or the average expenditure per day of the week by hour. This processed and analyzed data provides information that the manager could then use to determine, for example, how many cashiers to hire and at what times. Such information could then become data for further processing, for example as part of a predictive marketing campaign. As a result of processing, raw data sometimes ends up being put in a database, which makes the raw data accessible for further processing and analysis in any number of different ways.
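The supermarket example can be sketched in a few lines. The record layout, the sample values, and both function names are hypothetical, and "average expenditure per day of the week by hour" is interpreted here as the mean item price within each (weekday, hour) bucket, which is only one possible reading.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical raw records: (customer id, ISO timestamp, item, price).
raw_sales = [
    ("c1", "2024-03-04T09:15:00", "milk", 2.00),
    ("c1", "2024-03-04T09:15:00", "bread", 3.00),
    ("c2", "2024-03-04T09:40:00", "eggs", 4.00),
    ("c1", "2024-03-05T17:05:00", "coffee", 7.00),
]


def average_total_per_customer(records):
    """Sum spending per customer, then average those totals."""
    totals = defaultdict(float)
    for customer, _, _, price in records:
        totals[customer] += price
    return mean(list(totals.values()))


def average_spend_by_weekday_hour(records):
    """Mean item price per (weekday, hour) bucket; Monday is weekday 0."""
    buckets = defaultdict(list)
    for _, stamp, _, price in records:
        when = datetime.fromisoformat(stamp)
        buckets[(when.weekday(), when.hour)].append(price)
    return {key: mean(prices) for key, prices in buckets.items()}


per_customer = average_total_per_customer(raw_sales)
by_slot = average_spend_by_weekday_hour(raw_sales)
```

The resulting summaries are themselves data that a later stage, such as a staffing or marketing model, could treat as its raw input.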

Tim Berners-Lee (inventor of the World Wide Web) argues that sharing raw data is important for society. Inspired by a post by Rufus Pollock of the Open Knowledge Foundation, his call to action is "Raw Data Now", meaning that everyone should demand that governments and businesses share the data they collect as raw data. He points out that "data drives a huge amount of what happens in our lives… because somebody takes the data and does something with it." To Berners-Lee, it is essentially from this sharing of raw data that advances in science will emerge. Advocates of open data argue that once citizens and civil society organizations have access to data from businesses and governments, citizens and NGOs will be able to do their own analysis of the data, which can empower people and civil society. For example, a government may claim that its policies are reducing the unemployment rate, but a poverty advocacy group may be able to have its staff econometricians do their own analysis of the raw data, which may lead this group to draw different conclusions about the data set.

Critiques of raw data

Critical data studies scholars have critiqued the term "raw data".[2][3] The critique stems from the idea that data can never be raw; instead, data are always constructed and shaped by the decisions of people. Humanities scholar Johanna Drucker has argued that data are "capta, taken and constructed".[4] For example, when data are generated by a thermometer or other instrument, they are shaped by the configurations specific to the design of that instrument.

Distinction between raw and processed data

Raw data, often called primary data, is the unprocessed and original form of data collected directly from sources or instruments. It may contain errors, inconsistencies, or redundant information, and it typically requires processing steps such as cleaning, validation, and structuring to become usable. Processed data results from transforming raw data into an organized and interpretable format, making it suitable for analysis, visualization, and decision-making. The flexibility and completeness of raw data enable comprehensive and diverse analyses, while processed data serves as the practical basis for generating actionable insights. Because raw data preserves the fullest detail, it is valuable in scientific research and machine learning, where high-quality inputs are critical for the accuracy of conclusions and for model training.

References

  1. Kitchin, Rob (2014). The Data Revolution. United States: Sage. p. 6.
  2. Gitelman, Lisa (2013). Raw data is an oxymoron. MIT Press.
  3. Loukissas, Yanni Alexander (2019). All data are local: Thinking critically in a data-driven society. MIT Press.
  4. Drucker, Johanna (2011). "Humanities Approaches to Graphical Display". Digital Humanities Quarterly. 5 (1).

Retrieved from "https://en.wikipedia.org/w/index.php?title=Raw_data&oldid=1335656489"