Data extraction is the process of retrieving data from data sources for further processing or storage (data migration). The import into the intermediate extracting system is usually followed by data transformation, and possibly the addition of metadata, before export to another stage in the data workflow. Extracting data from web pages has grown into a considerable technical challenge and is referred to as web scraping.
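The sequence described above (extract, transform, add metadata, export) can be illustrated with a minimal sketch. It assumes a small CSV export as the data source and JSON Lines as the export format. The sample data, the field names, the function names and the `_source` and `_extracted_at` metadata keys are all hypothetical and chosen only for illustration.

```python
import csv
import io
import json
from datetime import datetime, timezone

# Hypothetical raw export from a source system (illustrative data only).
RAW_CSV = (
    "user_id,email,signup_date\n"
    "1, Ada@Example.com ,2021-03-04\n"
    "2,bob@example.com,2021-05-06\n"
)


def extract(raw_text):
    """Extract: read rows out of the source into plain dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: convert types and normalise whitespace and letter case."""
    return [
        {
            "user_id": int(row["user_id"]),
            "email": row["email"].strip().lower(),
            "signup_date": row["signup_date"],
        }
        for row in rows
    ]


def add_metadata(rows, source_name):
    """Attach metadata recording where and when each record was extracted."""
    extracted_at = datetime.now(timezone.utc).isoformat()
    return [
        dict(row, _source=source_name, _extracted_at=extracted_at)
        for row in rows
    ]


def export(rows):
    """Export: serialise the records as JSON Lines for the next stage."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


def run_pipeline(raw_text, source_name):
    """Run the four steps in order and return the exported text."""
    return export(add_metadata(transform(extract(raw_text)), source_name))
```

Calling `run_pipeline(RAW_CSV, "registration_export")` returns one JSON object per line, each carrying the cleaned fields plus the two metadata keys. Real pipelines typically read from files, databases or APIs rather than an in-memory string, but the ordering of the steps is the same.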
Publishers have access to raw data from a variety of first- and second-party sources, such as registration data, web analytics and ad server data.
On their own, these separate data sources may provide value, but without organization the data cannot be used to its full potential. Analytics addresses this problem by connecting the data sources, structuring the data according to specific business rules, and making the data actionable. This gives publishers deeper insight into their audience and supports better business decisions.
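As a rough illustration of connecting sources and applying a business rule, the sketch below joins three small in-memory datasets on a shared user identifier and then labels each user with a segment. Everything in it is an assumption made for the example: the field names, the sample records, the `build_profiles` function and the rule that ten or more page views marks a user as "engaged".

```python
# Hypothetical records from three separate sources (illustrative data only).
registrations = [
    {"user_id": 1, "plan": "free"},
    {"user_id": 2, "plan": "paid"},
]
web_analytics = [
    {"user_id": 1, "page_views": 12},
    {"user_id": 2, "page_views": 3},
    {"user_id": 1, "page_views": 5},
    {"user_id": 3, "page_views": 40},  # no registration record, so skipped
]
ad_server = [
    {"user_id": 1, "ad_clicks": 2},
    {"user_id": 2, "ad_clicks": 0},
]


def build_profiles(registrations, web_analytics, ad_server, min_page_views=10):
    """Connect the sources on user_id, then apply an assumed business rule."""
    # Start one profile per registered user.
    profiles = {
        reg["user_id"]: {
            "user_id": reg["user_id"],
            "plan": reg["plan"],
            "page_views": 0,
            "ad_clicks": 0,
        }
        for reg in registrations
    }

    # Fold web analytics events into the matching profile.
    for event in web_analytics:
        profile = profiles.get(event["user_id"])
        if profile is not None:
            profile["page_views"] += event["page_views"]

    # Fold ad server records into the matching profile.
    for record in ad_server:
        profile = profiles.get(record["user_id"])
        if profile is not None:
            profile["ad_clicks"] += record["ad_clicks"]

    # Assumed business rule: enough page views marks a user as "engaged".
    for profile in profiles.values():
        engaged = profile["page_views"] >= min_page_views
        profile["segment"] = "engaged" if engaged else "casual"

    return [profiles[user_id] for user_id in sorted(profiles)]
```

The result of `build_profiles(registrations, web_analytics, ad_server)` is one combined record per registered user. The segment label is the kind of added structure that makes the data actionable, for example by selecting which users receive a particular campaign.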