10 Years of Automated Category Classification for Product Data
Johannes Knopp
Artificial Intelligence, Deep Learning, Data Science, Infrastructure, Machine Learning, Data Engineering

10 years ago we built a classifier for categorizing product data. Let's take a journey through the lessons we learned over the years about building, maintaining, and modernizing the category classifier.

Airflow: your ally for automating machine learning and data pipelines
Enrica Pasqua, Bahadir Uyarer
Big Data, Infrastructure, Machine Learning, Data Engineering

Automate your machine learning and data pipelines with Apache Airflow

Automated Feature Engineering and Selection in Python
Franziska Horn
Data Science, Machine Learning, Science, Data Engineering, Statistics

Automated feature engineering and selection in Python with the autofeat library.

Automating feature engineering for supervised learning? Methods, open-source tools and prospects.
Thorben Jensen
Artificial Intelligence, Algorithms, Data Science, Machine Learning, Data Engineering

How can the labor-intensive task of feature engineering for machine learning be automated? This talk gives an overview of methods, presents open-source Python libraries, and compares their performance.

Decentralized and Privacy-Preserving ML via TensorFlow Federated
Peter Kairouz, Amlan Chakraborty
Artificial Intelligence, Deep Learning, Data Science, Machine Learning, Data Engineering

Meet TensorFlow Federated: an open-source framework for machine learning and other computations on decentralized data.

Fighting fraud: finding duplicates at scale
Alexey Grigorev
Data Science, Infrastructure, Machine Learning, Data Engineering

Fight fraudsters at scale: use machine learning to find duplicates in 10 million ads daily

Introduction to automated testing with pytest
Raphael Pierzina
DevOps, Web, Data Engineering

Learn how to get started with developing automated tests in Python with the pytest test framework!

Kartothek – Table management for cloud object stores powered by Apache Arrow and Dask
Florian Jetter
Big Data, Data Engineering

Kartothek: table management for cloud object stores, powered by Apache Arrow and Dask.

Managing the end-to-end machine learning lifecycle with MLflow
Tobias Sterbak
Data Science, Infrastructure, Machine Learning, Data Engineering

How to manage the end-to-end machine learning lifecycle with MLflow.

Mock Hell
Edwin Jung
Code-Review, Web, Data Engineering

Mock Hell: How to escape and avoid it, and improve your design in the process.

Production-level data pipelines that make everyone happy using Kedro
Yetunde Dada
Data Science, DevOps, Machine Learning, Data Engineering

Learn how easy it is to apply software engineering principles to your data science and data engineering code. Expect an overview of Kedro, a library that implements best practices for data pipelines with an eye towards productionizing ML models.

Transforming a Legacy System into a Bias-Mitigating AI Solution for Debt Repayment
Avaré Stewart
Artificial Intelligence, Data Science, Natural Language Processing, Machine Learning, Data Engineering

Unleash intelligence in your data: transform a legacy system into a bias-mitigating AI solution for debt repayment with Tesseract, spaCy, and AI Fairness 360.

What we learned from scraping 1 billion webpages every month
Samet Atdag
Business & Start-Ups, Big Data, Infrastructure, Web, Data Engineering

We broke the web via simple hacks. Instead of order, we caused chaos. How do we fix that?

🌈Apache Airflow for beginners
Varya
Infrastructure, Data Engineering

Airflow can sound more complicated than it is. Learn the basics through a practical example.
