Data Engineer
San Francisco
Speak
Speak is the first & only app that lets you get real conversational practice without needing a live tutor on the other end. And we build some serious AI tech to make that possible. Our mission is to become the de facto way people learn foreign languages. We begin by teaching the next billion people English and Spanish.
English is the global language of business, culture, and communication, and over 1.5 billion people around the world are actively trying to learn right now. Others dream of communicating with the half-billion native Spanish speakers across the globe. The problem is that it's nearly impossible to learn to speak a language without constant access to a speaking partner. Grammar and vocab apps don't really help – you need to actually converse with someone.
Speak is on a journey to fix this. We're creating an AI-powered experience that replicates the flow of a conversation, without needing a human on the other end. The goal is to make it radically more accessible to be able to have conversations in a foreign language and eventually help hundreds of millions of people gain fluency who otherwise wouldn't be able to.
We started on this journey over five years ago and we've still got a long ways to go. We're thoughtfully adding new team members only when we think they can truly play a big role in our mission.
Speak launched first in South Korea where we have quickly grown to become the top grossing education app in the country. We have now delivered this winning product to more than 30 countries globally and are continuing to expand to more markets in the coming months. The company is well funded, raising a recent Series B backed by investors like OpenAI, Founders Fund, Y Combinator, Khosla Ventures, Lachy Groom, Josh Buckley, and others. We’re a team of 60 based primarily in SF, Seoul, Tokyo, and Ljubljana.
About this role
As a Data Engineer at Speak, you'll play a pivotal role in shaping the future of digital language learning, propelling us towards our mission of making language proficiency accessible to millions worldwide.
Your responsibilities will span the crucial intersection of data infrastructure and analytics, from managing scalable data pipelines that support real-time processing, to deploying sophisticated analytics solutions that drive personalized learning experiences. You'll work closely with our product, engineering, and data science teams to ensure that our platform, powered by cutting-edge technology, is not only robust but also delivers actionable insights that enhance user engagement and learning outcomes.
What you'll be doing
Design and Build Data Infrastructure: You'll architect and implement robust, scalable data pipelines that ensure efficient data flow and processing, supporting both real-time analytics and large-scale, batch processing needs. Your work will be critical in managing the ingestion, storage, and accessibility of data from various sources, ensuring our platform's backbone is strong and reliable.
Enable Data-Driven Decisions: By collaborating with cross-functional teams, you will develop and deploy tools and frameworks that facilitate data access and analysis, empowering product and business teams to make informed decisions. This includes creating dashboards, reports, and advanced analytics models that reveal user behavior patterns, learning efficacy, and opportunities for product improvement.
Optimize Data Architecture: Constantly evaluate and refine the data architecture to support our growing data needs and ensure optimal performance. This includes managing data lakes, data warehouses, and databases, as well as implementing best practices for data modeling, data quality, and data governance.
Support Machine Learning Projects: Work closely with data scientists and machine learning engineers by providing them with clean, structured data for building and deploying predictive models that enhance personalized learning experiences and engagement strategies.
Innovate and Experiment: Stay ahead of the curve by researching and implementing cutting-edge technologies and methodologies in data engineering and analytics.
Collaborate Across Teams: As a key player in the engineering team, you'll work closely with product managers, analysts, and other engineers to bring data-driven products and features from concept to launch.
Data Modeling: Deep understanding of data structures, theories, principles, and practices. Ability to design, implement, and manage data warehouses effectively.
Big Data Technologies: Proficiency in big data technologies and frameworks such as dbt and Airflow to handle large-scale data processing and analysis.
Programming Skills: Strong programming skills in languages relevant to data engineering (Python and SQL). Ability to write efficient, reliable, and maintainable code.
Data Pipeline and ETL Development: Experience in building and optimizing data pipelines, architectures, and data sets. Familiarity with ETL (extract, transform, load) processes and tools.
Cloud Computing: Knowledge of cloud services (GCP BigQuery, dbt) and understanding of how to leverage them for data processing and storage solutions.
Data Analysis and Visualization: Ability to analyze data to identify patterns, anomalies, and insights. Proficiency in using data visualization tools (e.g., Mode) to communicate findings clearly.
Debugging Skills: Strong problem-solving skills and the ability to approach complex challenges methodically, including data inconsistency issues.
Effective Communication: Ability to communicate technical information to non-technical stakeholders clearly and effectively. This includes writing documentation, presenting findings, and collaborating on projects.
Bonuses: Experience with AppsFlyer, Segment, Split.io, Customer.io, Facebook Ads, and Google BigQuery.
Why work at Speak
Join a fantastic, tight-knit team at the right time: we're growing very quickly, we've raised our Series B and an additional extension from some of the top investors in the valley, and we've achieved product-market fit in our initial markets. You'd join at a magical time when a single person could significantly change the course of the company.
Do your life's work with people you’ll love working with: we care strongly about our craft and want every person at Speak to feel like they're growing every day. We believe in the idea that working with people you both enjoy and have respect for makes everything better. We hire thoughtfully and only work with people we admire deeply.
Global in nature: We're live in over 40 countries and launching in a number of new markets soon. We have dedicated offices in San Francisco, Ljubljana, Seoul, and Tokyo, and you’ll have the opportunity to talk to users in each of these regions on a regular basis as well as travel.
Impact people's lives in a major way: Learning a language is one of the single most life-changing skills one can learn, and right now 99% of people never achieve their goal because the process is broken. We’re helping millions of people achieve their goals and improve their lives.
Speak does not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.
Tags: Airflow Architecture Big Data BigQuery Data analysis Data governance Data pipelines Data quality Data visualization dbt Engineering ETL GCP Machine Learning OpenAI Pipelines Python SQL Teaching
Perks/benefits: Career development