Edinburgh, Scotland, UK
October 21 & October 25 | Co-Located Events, Tutorials, & Workshops
October 22-24 | Conference
Find out more information for Open Source Summit + Embedded Linux Conference & OpenIoT Summit Europe 2018

Please note that you can view and download presentations on the Open Source Summit and Embedded Linux Conference + OpenIoT Summit slides pages. 
Monday, October 22 • 15:05 - 15:45
Streaming Pipelines for Neural Machine Translation - Suneel Marthi, ASF

Machine Translation matters whenever news or e-commerce website content has to serve different geographies and locales. Machine Translation systems often need to handle a large volume of concurrent translation requests from multiple sources in multiple languages, and they have to do this in real time while making efficient use of specialized hardware.

Many Machine Translation preprocessing tasks, such as text normalization, language detection, and sentence segmentation, can be performed at scale in a real-time streaming pipeline built on Apache Flink. We will look at a few such streaming pipelines that use Apache OpenNLP components to preprocess data into a format that a Neural Machine Translation library can consume.
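
As a rough illustration of the preprocessing stages named above (normalization, language detection, sentence segmentation), here is a minimal sketch in plain Python. It does not use Apache Flink or Apache OpenNLP: the generator stands in for a streaming operator, the stopword-overlap detector and the regex sentence splitter are toy substitutes for trained models, and every name in it (STOPWORDS, normalize, detect_language, split_sentences, preprocess) is hypothetical.

```python
import re
import unicodedata
from typing import Dict, Iterable, Iterator, List, Set, Tuple

# Hypothetical, tiny stopword lists for a toy detector (an assumption for
# illustration; a real pipeline would use a trained language-detection model).
STOPWORDS: Dict[str, Set[str]] = {
    "en": {"the", "and", "is", "of", "to", "in"},
    "de": {"der", "die", "und", "ist", "das", "nicht"},
}


def normalize(text: str) -> str:
    """Apply Unicode NFKC normalization and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", unicodedata.normalize("NFKC", text)).strip()


def detect_language(text: str) -> str:
    """Pick the language whose stopword list overlaps the text the most."""
    words = set(re.findall(r"\w+", text.lower()))
    scores = {lang: len(words & stop) for lang, stop in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def split_sentences(text: str) -> List[str]:
    """Naive segmentation: split on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def preprocess(stream: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Lazily turn raw documents into (language, sentence) records."""
    for raw in stream:
        text = normalize(raw)
        if not text:
            continue  # drop documents that are empty after normalization
        lang = detect_language(text)
        for sentence in split_sentences(text):
            yield lang, sentence
```

Because preprocess is a generator, records are emitted one at a time as input arrives, which loosely mirrors how a streaming operator would hand sentences to a downstream translation step.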

We'll demonstrate a pipeline that detects the language of news articles shared via Twitter and translates them in real time, and we'll examine its end-to-end throughput and latency. Developers will come away with a better understanding of how Neural Machine Translation works and how to build pipelines for machine translation preprocessing tasks and Neural Machine Translation models, along with access to a demo repository for experimenting with and building machine translation models themselves.

Speakers
Suneel Marthi

AWS
Suneel is a member of the Apache Software Foundation and a committer and PMC member on Apache Mahout, Apache OpenNLP, and Apache Streams. He has presented at Flink Forward, Hadoop Summit, Berlin Buzzwords, Machine Learning Conference, Big Data Tech Warsaw, and Apache Big Data.


Monday October 22, 2018 15:05 - 15:45 BST
Fintry Auditorium, Level 3