Glossary

datakulture has compiled a glossary of terms and definitions that every data-driven professional needs to know to navigate the evolving landscape of data science and analytics.

A2

Anomaly detection

In data terms, an anomaly is any data point that deviates significantly from the rest of the data. Anomaly detection is the process of finding these unusual patterns and deviations that don't conform to the norm.
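
As a rough illustration, one common way to flag deviating points is a z-score check. The sketch below is only one of many possible approaches; the function name `find_anomalies`, the sample readings, and the threshold of 2.0 are assumptions chosen for the example.

```python
from statistics import mean, stdev

def find_anomalies(values, threshold=2.0):
    """Return values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing deviates
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical sensor readings with one obvious outlier.
readings = [10, 11, 9, 10, 12, 10, 11, 95]
anomalies = find_anomalies(readings)
print(anomalies)
```

A z-score check assumes the data is roughly bell-shaped; other data may call for a different method or threshold.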

ACID property

ACID is an acronym for the four properties of database transactions: atomicity, consistency, isolation, and durability. Databases that guarantee these properties ensure that each transaction is processed reliably and completely.
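
Atomicity, the first of these properties, can be sketched with Python's built-in `sqlite3` module. The `accounts` table, the names, and the `transfer` helper are hypothetical; the point is only that a failed transaction leaves no partial changes behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(conn, sender, receiver, amount):
    """Move funds in a single transaction; return True if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            # This debit violates the CHECK constraint if funds are short,
            # which also undoes the credit above.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "alice", "bob", 500)  # alice only has 100
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)
```

Because the credit and the debit run in one transaction, the failed debit rolls back the credit as well, so both balances stay unchanged.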

B1

Batch processing

Batch processing is a data movement term for collecting, storing, and processing data in batches over a period of time. A series of tasks runs on each batch at scheduled times without manual intervention.
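
A minimal sketch of the batching step is shown here. The helper names `batches` and `process_batch` are made up for the example, and the scheduling part (for instance, a job scheduler that triggers the run) is assumed and not shown.

```python
from itertools import islice

def batches(records, size):
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def process_batch(batch):
    """Placeholder task: here we simply total the values in the batch."""
    return sum(batch)

# Seven records processed in batches of three.
totals = [process_batch(b) for b in batches(range(1, 8), 3)]
print(totals)
```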

C2

Cloud data warehouse

A cloud data warehouse is a repository that stores your organizational data in the cloud rather than on on-premises infrastructure. Cloud data warehousing offers flexible, scalable, and cost-effective solutions for data storage, querying, and analytics.

Customer support KPIs

Customer support KPIs are measurable values used to assess the effectiveness of your customer support team.

D11

Data transformation

Data transformation is changing data from one format, structure, or type to another to make it suitable for analysis. Think of this as prepping ingredients before starting to cook.
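
For instance, a raw record that arrives as text can be converted into typed values ready for analysis. The field names and formats below are assumptions made up for this sketch.

```python
from datetime import date

def transform(row):
    """Convert a raw string record into typed, analysis-ready values."""
    return {
        "order_id": int(row["order_id"]),
        # Strip thousands separators before converting to a number.
        "amount": float(row["amount"].replace(",", "")),
        "order_date": date.fromisoformat(row["order_date"]),
    }

raw = {"order_id": "42", "amount": "1,250.50", "order_date": "2024-03-01"}
clean = transform(raw)
print(clean)
```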

Data processing

Data processing is the series of steps involved in collecting, transforming, and organizing raw data into meaningful insights.

Data cleansing

Data cleansing is a data quality improvement process that removes duplicates, errors, and inconsistencies from data to make it more usable.
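
A small sketch of what this can look like in practice: normalizing values, dropping invalid rows, and removing duplicates. The `cleanse` function, the record layout, and the very loose email check are all assumptions for illustration.

```python
def cleanse(records):
    """Normalize names and emails, drop invalid rows, remove duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        email = (rec.get("email") or "").strip().lower()
        if "@" not in email:  # deliberately simple validity check
            continue
        if email in seen:     # duplicate once the email is normalized
            continue
        seen.add(email)
        name = (rec.get("name") or "").strip().title()
        cleaned.append({"name": name, "email": email})
    return cleaned

dirty = [
    {"name": " ada lovelace", "email": "Ada@Example.com "},
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "bob", "email": "not-an-email"},
]
result = cleanse(dirty)
print(result)
```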

Data discovery

Data discovery is the process of collecting data from various sources, categorizing it, and preparing it for analysis.

Data swamp

A data swamp is an unorganized data repository flooded with massive amounts of data, which makes it challenging to derive meaningful insights.

Data anonymization

Data anonymization is a data protection process that replaces sensitive and identifiable information with random characters before the data is shared for analysis or research purposes.
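
A minimal sketch of the idea, assuming records are simple dictionaries and that replacing a field with a random token is sufficient for the use case. The `anonymize` helper and field names are hypothetical; real anonymization usually needs more care, for example around combinations of fields that can still identify a person.

```python
import secrets

def anonymize(records, sensitive_fields):
    """Return copies of the records with sensitive fields replaced by random tokens."""
    anonymized = []
    for rec in records:
        copy = dict(rec)  # leave the original record untouched
        for field in sensitive_fields:
            if field in copy:
                copy[field] = secrets.token_hex(8)  # 16 random hex characters
        anonymized.append(copy)
    return anonymized

patients = [{"name": "Jane Doe", "email": "jane@example.com", "age": 34}]
shared = anonymize(patients, ["name", "email"])
print(shared)
```

No mapping from the random tokens back to the original values is kept, so the replacement cannot be reversed from the shared data alone.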

Data partitioning

Data partitioning is a database management process that splits large datasets into manageable portions called partitions.
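
One common strategy is hash partitioning, where a key decides which partition a record lands in. The sketch below assumes in-memory records and a hypothetical `partition` helper; databases implement this internally and also offer other schemes, such as range or list partitioning.

```python
import zlib
from collections import defaultdict

def partition(records, key, num_partitions):
    """Assign each record to a partition based on a stable hash of its key."""
    parts = defaultdict(list)
    for rec in records:
        # crc32 is stable across runs, unlike Python's built-in hash() for strings.
        idx = zlib.crc32(str(rec[key]).encode("utf-8")) % num_partitions
        parts[idx].append(rec)
    return dict(parts)

orders = [
    {"customer": "acme", "total": 120},
    {"customer": "globex", "total": 75},
    {"customer": "acme", "total": 40},
]
parts = partition(orders, "customer", 4)
print(parts)
```

Records that share a key always land in the same partition, which keeps lookups for that key confined to one portion of the data.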

Data masking

Data masking is a security measure that hides sensitive data by replacing it with dummy characters.
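
As a simple sketch, a value can be masked while keeping its last few characters visible. The `mask` function, its defaults, and the sample card number are assumptions for the example.

```python
def mask(value, visible=4, mask_char="*"):
    """Replace all but the last `visible` characters with a dummy character."""
    if visible <= 0 or len(value) <= visible:
        return mask_char * len(value)  # nothing is safe to show
    return mask_char * (len(value) - visible) + value[-visible:]

masked = mask("4111111111111111")
print(masked)
```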

Data lineage

Data lineage is a data management practice that traces the entire lifecycle of data: how it is created, how it is transformed, and how it reaches its destination.

Data fabric

Data fabric is a modern data management architecture. It integrates and unifies data management, governance, and security across hybrid environments.

Data mart

A data mart is a storage repository focused on a specific line of business, for example, a marketing data mart that centralizes the marketing department's data. This way, marketing users don't have to search the organization-wide database to get insights.

E2

ETL

ETL (extract, transform, and load) is the process of extracting data from various sources, transforming it into a standard format, and loading it into a destination.
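
The three stages can be sketched end to end with Python's standard library. The two inline sources, the column names, and the `customers` table are all made up for illustration; a real pipeline would read from files, APIs, or databases.

```python
import csv
import io
import json
import sqlite3

# Two hypothetical sources with different layouts.
CSV_SOURCE = "id,name,revenue\n1,Acme,1200\n2,Globex,850\n"
JSON_SOURCE = '[{"customer_id": 3, "customer": "Initech", "rev": "430"}]'

def extract():
    """Read raw rows from each source."""
    csv_rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    json_rows = json.loads(JSON_SOURCE)
    return csv_rows, json_rows

def transform(csv_rows, json_rows):
    """Map both layouts onto one standard (id, name, revenue) format."""
    unified = [(int(r["id"]), r["name"], float(r["revenue"])) for r in csv_rows]
    unified += [
        (int(r["customer_id"]), r["customer"], float(r["rev"])) for r in json_rows
    ]
    return unified

def load(rows, conn):
    """Write the standardized rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, name TEXT, revenue REAL)"
    )
    with conn:
        conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(transform(*extract()), conn)
total = conn.execute("SELECT SUM(revenue) FROM customers").fetchone()[0]
print(total)
```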

eCommerce KPIs

eCommerce KPIs are performance indicators that reflect the customer behavior, growth, and profitability of an eCommerce business.

F1

Finance KPIs

Finance KPIs are values that help you estimate how well your company is achieving its financial goals. They give you a view of its financial health, assets and expenses, and past and present cash flow.

H1

HR KPIs

HR KPIs are useful, relevant, and measurable metrics that HR teams use to assess the performance of human capital and resources. They indicate the impact of workforce management, employee engagement, hiring, retention, and more.

L1

Legacy systems

A legacy system is an outdated piece of software, tool, or technology that's still in use even though modern alternatives are available. In simpler terms, a legacy system is old software or hardware built on an obsolete tech stack.

M3

Master data management

Master data is an organization's most important data, such as customer data and product data. Master data management is the process of managing and organizing these critical data assets in a clear, consistent, and accurate manner.

Marketing KPIs

Marketing teams take on many roles: SEO, paid channels, advertising, content creation, and more. How do they know how well these efforts turn out? Marketing KPIs can answer that.

Metadata management

Metadata management is the process of organizing, managing, and updating metadata in a central repository.

S2

Serverless architecture

Serverless architecture is a cloud resource management concept in which the user focuses only on running applications while the cloud provider manages the underlying infrastructure and scales it on demand.

Sales KPIs

Sales KPIs measure how well a sales team is performing and how that performance affects a company's financial health. By measuring sales KPIs, a company or a sales head can understand the effectiveness of their sales strategies and sharpen them further by addressing the gaps.