The world of data at Gojek


By Adrien Chenailler

At Gojek, the data world has been split between Analysts and Data Scientists. Analysts excel at analysing data, making sense of it, and driving critical business and product decisions that help us identify what makes us thrive or what makes us irrelevant. On the other hand, Data Scientists infuse our Super App with machine learning, revolutionising the way we price, match, recommend, or fight fraud in the app, in real time.

Gojek has been a data-driven business since the very beginning and we grew along this axis — storing, organising, and utilising our data to power our growth and improve the Super App.

We’ve spent a tremendous amount of time automating and building tools to ensure easy access to data in a centralised, organised manner that enables us to make informed decisions.

Understanding our users and how they use our products is important. Our Analysts build complex text understanding models to provide state-of-the-art insights, and come up with data models and great visualisations for them.

We also needed experts who looked at data not only from a ‘problem solving’ perspective, but also from a ‘decision making’ angle.

Introducing ‘Decision Scientists’

Decision scientists are experts at making ambiguous decisions using models. Essentially, a decision scientist uses machine learning, among other statistical techniques, to tell us whether we should steer left or right.

The skills

Business

Understanding and modeling the business is key to this role. Decision scientists bring deeper insights into business and product decisions. They need to fully understand how the business works and where data science could be impactful.

Ambiguity and non-triviality

User behaviour (or human behaviour in general) is ambiguous and non-trivial. The ability to model this and avoid generalisation is key.

Behavioural and causal modeling

Decision scientists build a custom methodology to infer causation and enable more precise decisions. They don’t stop at correlation models.

Advanced statistics

Decision scientists need to tackle more complex problems in places where statistical relationships may be weak or hidden behind multiple confounding variables. Statistical rigour is an essential quality.

Modeling

Modeling and machine learning are still essential, but the focus is on understanding and explaining the decision rather than deploying a real-time system.

But… what type of projects do Decision Scientists work on?

Take this project, for example:

To quantify the impact of customer experience on core business metrics, most data-savvy people will build a simple model with metrics like number of app crashes, time to booking, etc. as variables. However, this type of problem is a lot more complex than it seems.

  • First, the business metric is unknown and needs to be defined. Would it be a long or short term impact? A small change in churn rate, for instance, can compound and become very costly in the long term, while weekly completed orders is a more immediate metric.
  • Second, would a longer booking time impact the business metric? Imagine we have an average time to booking of 120 seconds. Does decreasing this time improve business metrics?
  • Third, the problem is considerably more complicated given the number of confounding variables. You can expect that non-customer experience variables such as price will have a more direct impact. This is where the statistical challenge lies: inferring causality for second-order variables.
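
To make the third point concrete, here is a minimal Python sketch on simulated data. It is not Gojek data or Gojek's methodology. It assumes a hypothetical world where price drives both time to booking and orders, so price confounds the relationship we care about. Every coefficient, the variable names, and the `ols_coefficients` helper are illustrative assumptions.

```python
import numpy as np


def simulate(n=20_000, seed=0):
    """Simulate a hypothetical market where price confounds booking time and orders."""
    rng = np.random.default_rng(seed)
    # Confounder: standardised price (assumed, e.g. higher during busy periods).
    price = rng.normal(0.0, 1.0, n)
    # Busy, expensive periods also mean longer time to booking (seconds).
    booking_time = 120 + 20 * price + rng.normal(0.0, 10.0, n)
    # Assumed true causal effect of one extra second of booking time on orders.
    true_effect = -0.01
    # Price has a much larger, direct effect on orders.
    orders = 5.0 + true_effect * booking_time - 1.0 * price + rng.normal(0.0, 0.5, n)
    return price, booking_time, orders, true_effect


def ols_coefficients(y, *features):
    """Ordinary least squares; returns [intercept, coef_1, coef_2, ...]."""
    X = np.column_stack([np.ones_like(y), *features])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


price, booking_time, orders, true_effect = simulate()

# Naive model: orders ~ booking_time (ignores price).
naive = ols_coefficients(orders, booking_time)[1]
# Adjusted model: orders ~ booking_time + price (controls for the confounder).
adjusted = ols_coefficients(orders, booking_time, price)[1]

print(f"true effect:     {true_effect:.4f}")
print(f"naive slope:     {naive:.4f}")
print(f"adjusted slope:  {adjusted:.4f}")
```

In this simulated setup, the naive slope comes out several times larger in magnitude than the true effect, because booking time is silently taking the blame for price. Adding price to the regression brings the estimate back close to the true value. Keep in mind that this only works here because the confounder is observed and the relationships are linear by construction. Real problems rarely offer either guarantee.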

What makes this a true decision science problem? Well, the key here is understanding the mix of disciplines:

  • Causal statistics (how to infer causality rather than correlation)
  • Human behaviour (which triggers affect people immediately and which take effect later)
  • Business sense

As you can see, the definition is not straightforward. Even the target variable may contain a lot of measurement error.

A decision scientist excels at navigating uncertainty and ambiguity.

What are the key differences with other roles at Gojek?

The major difference is the use of the final product each role delivers. Although the boundary is porous, we have 4 purposes for our data products:

👤 Data for people → Enable easy access to data to make decisions from it

⚙️ Data for machines → Enable access to real-time and batch data

👤 Data science for people → Enable insights and model-driven decisions

⚙️ Data science for machines → Enable microdecisions in the app

At Gojek, our talented data team wears multiple hats to solve problems. As we scale, our problems get more complex and our skill requirements increase.

To help you understand the roles better, here’s a breakdown of how we divide the roles of analyst, data scientist, and decision scientist, and hire folks with specialised skills:

Oh, did I tell you we’re hiring Decision Scientists right now? Explore all open job positions here!

Click here for more stories on how we build our Gojek #SuperApp. 💚