What is data ingestion and why is it important?
Data ingestion is a simple term for a complex set of actions. It is more than just collecting data from various sources: it also includes normalizing and formatting that data and reconciling the differences between sources into a single coherent data set.
Cloud Service Providers (CSPs) constantly change their cost and usage data, including what information is provided and how you access it. For example, the AWS Cost and Usage Report (CUR) can contain millions of rows and easily exceed the capacity of most desktop spreadsheets. When a new service is introduced, CUR fields can be added or formats changed. When you use multiple CSPs, keeping up with the different formats and changes is a daunting task.
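To make the normalization step concrete, here is a minimal sketch of mapping billing rows from two providers onto one common set of fields. The mapping table, the "other_csp" provider, and the function name normalize_rows are all hypothetical, and the column names are illustrative assumptions rather than a complete or current schema for any provider.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical per-provider mappings: source column -> common field.
# Column names are illustrative assumptions, not an authoritative schema.
COLUMN_MAPS: Dict[str, Dict[str, str]] = {
    "aws": {
        "lineItem/UsageAccountId": "account_id",
        "product/ProductName": "service",
        "lineItem/UnblendedCost": "cost",
    },
    "other_csp": {
        "SubscriptionId": "account_id",
        "ServiceName": "service",
        "Cost": "cost",
    },
}


def normalize_rows(
    provider: str, rows: Iterable[Dict[str, Optional[str]]]
) -> List[Dict[str, object]]:
    """Map provider-specific rows onto the common fields."""
    mapping = COLUMN_MAPS[provider]
    normalized = []
    for row in rows:
        record: Dict[str, object] = {"provider": provider}
        for source_col, common_field in mapping.items():
            # A missing column becomes None instead of raising an error.
            record[common_field] = row.get(source_col)
        # Costs usually arrive as text; store them as numbers.
        record["cost"] = float(record["cost"] or 0.0)
        # Columns not listed in the mapping (for example, fields added
        # for a new service) are ignored rather than treated as fatal.
        normalized.append(record)
    return normalized
```

Keeping the mapping in data rather than in code means a renamed or added column can, in this sketch, be handled by editing one table entry. Real billing exports need more care than this, such as currency, time zones, and credits.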
On top of the CSP-provided data, your business likely has its own metadata that adds context, such as business unit, cost center, and owner. That metadata changes constantly as well.
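As a rough illustration of adding business context, the sketch below attaches metadata to cost records keyed by account. The record shape, the metadata lookup, the "unallocated" fallback, and the function name enrich are assumptions made for this example, not a prescribed design.

```python
from typing import Dict, List

# Assumed placeholder for accounts that have no metadata yet.
UNKNOWN = "unallocated"


def enrich(
    records: List[Dict[str, object]],
    account_metadata: Dict[str, Dict[str, str]],
) -> List[Dict[str, object]]:
    """Return copies of the records with business metadata attached."""
    enriched = []
    for record in records:
        # Look up metadata by account; fall back to an empty mapping.
        meta = account_metadata.get(str(record["account_id"]), {})
        enriched.append({
            **record,
            "business_unit": meta.get("business_unit", UNKNOWN),
            "cost_center": meta.get("cost_center", UNKNOWN),
            "owner": meta.get("owner", UNKNOWN),
        })
    return enriched
```

Falling back to a visible placeholder, rather than dropping the row, keeps untagged spend in the totals so it can be found and assigned later.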