Created by Matthew Evans
almost 2 years ago
Question | Answer |
Big Data | Datasets that are too large and complex for a business's existing systems to capture, store, manage, and analyze using their traditional capabilities. |
Classification | A data approach that attempts to assign each unit in a population to one of a few categories, potentially to help with predictions. |
Clustering | A data approach that attempts to divide individuals (like customers) into groups (or clusters) in a useful or meaningful way. |
Co-Occurrence Grouping | A data approach that attempts to discover associations between individuals based on transactions involving them. |
Data Analytics | The process of evaluating data in order to draw conclusions that address business questions. Effective Data Analytics provides a way to search through large structured and unstructured datasets to identify unknown patterns or relationships. |
Data Dictionary | Centralized repository of descriptions for all of the data attributes of the dataset. |
Data Reduction | A data approach that attempts to reduce the amount of information that needs to be considered in order to focus on the most critical items (e.g., highest cost, highest risk, largest impact). |
Link Prediction | A data approach that attempts to predict a relationship between two data items. |
Predictor (or Independent or Explanatory) Variable | A variable that predicts or explains another variable (the response or dependent variable). |
Profiling | A data approach that attempts to characterize the “typical” behavior of an individual, group, or population by generating summary statistics about the data (such as the mean and standard deviation). |
Regression | A data approach that attempts to estimate or predict, for each unit, the numerical value of some variable using some type of statistical model. |
Response (or Dependent) Variable | A variable that responds to, or is dependent on, another. |
Similarity Matching | A data approach that attempts to identify similar individuals based on data known about them. |
Structured Data | Data that are organized and reside in a fixed field within a record or a file. Such data are generally contained in a relational database or spreadsheet and are readily searchable by search algorithms. |
Unstructured Data | Data that do not adhere to a predefined data model in a tabular format. |
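
A few of the data approaches defined above (profiling, regression, classification, and data reduction) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original card set: the advertising and sales figures are made up, and the helper names (`profile`, `fit_line`, `classify`, `top_n`) are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical data: advertising spend is the predictor (independent) variable,
# sales is the response (dependent) variable. Values are invented for illustration.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]


def profile(values):
    """Profiling: describe 'typical' behavior with summary statistics."""
    return {"mean": mean(values), "stdev": stdev(values)}


def fit_line(x, y):
    """Regression: ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    covariance = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    variance = sum((xi - x_bar) ** 2 for xi in x)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return slope, intercept


def classify(value, threshold):
    """Classification: assign a unit to one of a few categories."""
    return "high" if value >= threshold else "low"


def top_n(values, n):
    """Data reduction: keep only the n largest items to focus attention."""
    return sorted(values, reverse=True)[:n]


sales_profile = profile(sales)
slope, intercept = fit_line(ad_spend, sales)
predicted_sales = intercept + slope * 6.0  # estimate the response for a new predictor value
labels = [classify(s, sales_profile["mean"]) for s in sales]
largest = top_n(sales, 2)

print(sales_profile, slope, intercept, predicted_sales, labels, largest)
```

Here the regression uses the predictor variable to estimate a numerical response, while the classification step only places each unit into a category; the threshold (the mean) is an arbitrary choice made for the example.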