From Data to Stories: How to Speak Data Science

Data is huge right now, and most companies are trying to become data driven, but there is still a disconnect between analysts and decision-makers. How can we close that gap to make the best of our data?

Technology
13 min read

Data science is a field of study that combines computer science, statistics, and mathematics to extract knowledge and insights from data. It is a relatively new field that has emerged in the past few years as businesses have become increasingly reliant on data to make decisions. 

Data scientists use a variety of techniques to clean, analyze, and visualize data. They then use their findings to build models that can be used to make predictions or recommendations.

Data science is used in many different industries, including healthcare, finance, retail, and manufacturing. The skills required for data science are constantly evolving as new technologies emerge. However, some essential skills include programming languages (such as R and Python), statistical analysis, machine learning, and deep learning.

But perhaps the most important skill in a data scientist’s toolbox is communication. Data scientists must be able to clearly explain their findings to nontechnical stakeholders. They also need to be able to work with other teams, such as product managers and engineers, to implement their recommendations.

I had a great teacher back in college who used to tell us that math was the easy part of being a mathematician; learning to talk in such a way that people could understand what you were doing with numbers was the complicated part. And without a doubt he was right. Even the most creative algorithm is worthless if people can’t understand what to do with it.

This is why today I want to talk about how to communicate as a data scientist and how to communicate with a data scientist. Creating a communication framework is extremely important for reaching an agreement with people from all walks of life. With that in mind, here are seven simple guidelines that will help you make the best of your data.

Be Data Driven

In business, the term “data driven” is used a lot. But what does it actually mean to be data driven? Simply put, being data driven means making decisions based on data rather than intuition or gut feeling. Of course, this doesn’t mean that you should ignore your instincts altogether – after all, they’re usually based on years of experience. 

But when it comes to making decisions, you should always let the data guide you. There are a few reasons why being data driven is so important. First of all, it allows you to make better informed decisions. When you have hard evidence to back up your choices, you can be confident that they’re the right ones. 

Secondly, it helps you avoid bias and personal preferences when making decisions – again leading to better choices overall. Finally, basing your decisions on data means that you can constantly test and improve them over time, which would be much harder to do if you were relying solely on intuition and guesswork. 

Unfortunately, that’s easier said than done. To become data driven, you need to have data that is accurate, timely, and relevant. This data can come from a variety of sources, including surveys, customer feedback forms, social media posts, and website analytics. Once you have this data, you need to analyze it to identify trends and patterns.

This analysis will help you make better decisions about your business strategy and operations. To be truly data driven, you need to be constantly collecting and analyzing data. Make this cycle part of your business culture: educate your team on the importance of data, and promote workshops that will help everyone understand the basics of data and data analysis. 

Data analysis should never be “the weird thing the data scientists do in that corner.” It’s extremely hard to take data seriously if it’s siloed away from the rest of the company.

Don’t Say Numbers, Tell a Story

In data science, storytelling is the process of using data to tell a story. This can be done through visualizations, reports, or even just by talking about the data. The goal is to communicate the findings in a way that is easy to understand and makes sense to the audience. 

There are many different ways to tell a story with data. For example: 

  • A data scientist can use data to tell a story about how a company is doing or how a product is selling. 
  • A data analyst can use data to tell a story about customer behavior. 
  • A statistician can use data to tell a story about trends in the world. 

The important thing is that the story is based on facts and evidence from the data, which makes it more convincing and believable. The story should also be clear and concise, and interesting and informative without being too technical. Data science storytelling can be used to explain complex concepts, show trends over time, or even just entertain an audience.

Data storytelling is an important skill for any data scientist because it allows them to communicate their findings in a way that everyone can understand. But how do you tell a story? Here are some tips: 

  1. Start with a strong opening. This will grab the listener’s attention and set the tone for the rest of the story.
  2. Make sure your story has a clear beginning, middle, and end. This will help keep the listener engaged and make your story flow smoothly.
  3. Use descriptive language to paint a picture in the listener’s mind. The more vivid your description, the more likely it is that they will be able to visualize what you’re talking about.
  4. Use body language and facial expressions to emphasize points in your story. This will help bring the story to life and make it more engaging for the listener. 
  5. Be aware of your audience and tailor your story accordingly.

Wait, you might say, isn’t this just basic advice on how to tell a story? Yes, of course, there is a reason why we call this storytelling. The only difference is that the basis for your story is data, not creative imagery or folktales.

Show, Don’t Tell

Data visualization is one of the most important tools in a data analyst’s tool kit. It allows analysts to take large, complex data sets and turn them into easy-to-understand visualizations that can be used to inform decision-making. Data visualization is important not only for analysts, but for anyone who needs to make sense of data.

In our increasingly data-driven world, the ability to quickly understand and communicate information is becoming more and more valuable. There are many different ways to visualize data, and the best approach depends on the type of data being analyzed and the goals of the person doing the analysis. 

However, there are some general principles that all good visualizations should follow: 

  • Use clear and concise labels. The goal of a visualization is to communicate information as clearly as possible. This means using labels that are easy to understand and unambiguous. 
  • Use effective visuals. Different types of visuals (e.g., bar charts, line graphs, scatter plots) convey different types of information effectively. Choose the right type of visual for your data, and your message will be much clearer. 
  • Keep it simple. Don’t try to cram too much information into one visualization; less is often more when it comes to effective communication through visuals. Remember the old 10/20/30 rule for presentations: 10 slides, 20 minutes, and a font size of at least 30 points. Be assertive and focus on what’s really important.

Many data scientists make the huge mistake of thinking that people will quickly grasp even the most basic statistical concepts, but that’s wishful thinking. I can assure anyone out there that most people will have a hard time with something as simple as a correlation matrix until it finally clicks.

Now contrast a correlation matrix shown as a grid of numbers with the same matrix shown as a heat map. It’s exactly the same structure, but the colors make the heat map extremely intuitive; people will be drawn to the red squares and understand that there is something important there.
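To make the contrast concrete, here is a minimal sketch that prints the same correlation matrix twice: once as numbers and once as a shaded grid. It uses only Python’s standard library and ASCII shading so it runs anywhere; in practice you would probably reach for a plotting library instead. The column names and figures are invented for illustration, and `shade` and `render` are hypothetical helpers, not part of any library.

```python
from math import sqrt
from statistics import mean

# Hypothetical monthly figures, invented purely for illustration.
data = {
    "ad_spend": [10, 12, 15, 18, 20, 25],
    "visits": [100, 115, 160, 170, 210, 240],
    "returns": [9, 8, 8, 6, 5, 3],
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Return the column names and the square matrix of pairwise correlations."""
    names = list(columns)
    return names, [[pearson(columns[a], columns[b]) for b in names] for a in names]

def shade(r):
    """Map |r| in [0, 1] to a denser character as the correlation gets stronger."""
    levels = " .:*#"
    index = min(int(abs(r) * len(levels)), len(levels) - 1)
    return levels[index] * 2

def render(names, matrix, cell):
    """Format the matrix one row per line, using `cell` to draw each value."""
    width = max(len(name) for name in names)
    return "\n".join(
        name.ljust(width) + " " + " ".join(cell(r) for r in row)
        for name, row in zip(names, matrix)
    )

names, matrix = correlation_matrix(data)
print(render(names, matrix, lambda r: f"{r:+.2f}"))  # the grid of numbers
print()
print(render(names, matrix, shade))  # the same grid, as shading
```

Both printouts carry the same information, but the second lets the eye find the strong relationships without reading a single number. Note that this shading ignores the sign of the correlation, which a real heat map would usually encode with color.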

Additionally, to borrow one of Python’s core principles, explicit is better than implicit. No matter how obvious something is, always assume that people will not know about it; it’s better to over-explain than to present vague or incomplete information.

Trust the Technology

Machine learning is complicated because it involves a lot of math and statistics. To understand how it works, you need to be familiar with concepts like linear algebra, calculus, and probability. Even then, some models resist explanation. With neural networks, for example, we know which operations each hidden layer performs, but we usually can’t say what the learned weights mean or why they lead to a particular output. We just know that the layers transform the input into a representation that is useful for producing the output. This is why neural networks are considered black box models.

Other models often treated as black boxes include support vector machines and random forests. These models are difficult to interpret because there is no simple way to trace how a given prediction was made. This can be a problem when you’re trying to understand why the model is making certain decisions.

In my experience, when you tell a decision-maker that you don’t know why an algorithm is making a decision, you’ll find resistance. People don’t like uncertainty, but unfortunately, that’s the way some of these powerful models work. 

Take for example GPT-3 and DALL-E, two of OpenAI’s most advanced models for text generation and image generation. Why do they choose what they choose? Nobody can fully say, so what guarantee do we have that they aren’t going to make a mistake? In theory none, but in practice, we have seen them pull off some amazing feats time and again.

Reliability is key here. When a model keeps making the right decisions, it is hard to argue against it. And that’s why, even if we don’t have a clear picture of the decision-making process, if the model has proven itself in the past, you should at least give it the benefit of the doubt.
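One way to put a number on that track record is to compare a model’s past predictions with what actually happened, and to set the result next to a naive baseline. The sketch below is only an illustration: the churn labels are invented, and `track_record` and `majority_baseline` are hypothetical helper names.

```python
def track_record(predictions, outcomes):
    """Share of past predictions that matched what actually happened."""
    if len(predictions) != len(outcomes) or not predictions:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(p == o for p, o in zip(predictions, outcomes))
    return hits / len(predictions)

def majority_baseline(outcomes):
    """Accuracy you would get by always guessing the most common outcome."""
    most_common = max(set(outcomes), key=outcomes.count)
    return outcomes.count(most_common) / len(outcomes)

# Hypothetical history: what the model said vs. what really happened.
model_said = ["churn", "stay", "stay", "churn", "stay", "stay", "churn", "stay"]
happened = ["churn", "stay", "stay", "stay", "stay", "stay", "churn", "stay"]

accuracy = track_record(model_said, happened)
baseline = majority_baseline(happened)
print(f"model: {accuracy:.1%}, always-guess-majority baseline: {baseline:.1%}")
```

The baseline matters when you present the result: a model that is right most of the time is only impressive if a trivial guess would not do just as well.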

Another convincing argument in favor of listening to an AI is the massive amount of data used to train it. To return to a previous example, GPT-3’s training set was drawn from roughly 45 terabytes of raw text, filtered down to about 570 gigabytes before training. Even the filtered set holds more books and information than a single human being can consume in their lifetime.

Additionally, here are a few ways you can help people learn about AI and accept its validity: 

  1. Explain what AI is in simple terms. 
  2. Share examples of how AI is being used today. 
  3. Discuss the potential implications of AI for your business. 
  4. Help people understand the ethical boundaries around AI development and the protocols within your company.
  5. Offer resources for further learning about AI.
  6. Offer workshops, and teach simple courses where people can train their own AIs.

Remember the KISS Principle

The key to successful data science is to keep it simple. The KISS principle (keep it simple, stupid) applies here more than ever. Trying to do too much with data can actually lead to bad decision-making. Instead, businesses should focus on a few key objectives and use data to help them achieve those goals. By keeping things simple and focused, businesses can make the most of their data and ensure that it leads to better decision-making. 

For example, don’t use a complex model when a good old-fashioned regression will do the job. I am as eager as the next person to build a deep learning model and see what it can predict, but doing it when you only have two predictive variables is like killing a mosquito with a nuclear strike. You are wasting time and resources and adding a layer of complexity that isn’t required.
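As a rough illustration of how little a simple baseline needs, here is a sketch of an ordinary least squares fit with a single predictor, written with nothing but the standard library. The price and sales figures are invented, and `fit_line` and `r_squared` are hypothetical helper names.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    variance = sum((x - mx) ** 2 for x in xs)
    slope = covariance / variance
    return my - slope * mx, slope

def r_squared(xs, ys, intercept, slope):
    """Fraction of the variance in ys that the fitted line explains."""
    my = mean(ys)
    residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - my) ** 2 for y in ys)
    return 1 - residual / total

# Hypothetical data: unit price vs. units sold.
price = [1, 2, 3, 4, 5]
units = [52, 47, 45, 38, 33]

intercept, slope = fit_line(price, units)
fit_quality = r_squared(price, units, intercept, slope)
print(f"units = {intercept:.1f} + ({slope:.1f}) * price, R^2 = {fit_quality:.2f}")
```

If a line like this already explains most of the variation, a more complex model has very little left to add, and the regression has the advantage that anyone can read its two coefficients.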

Data Isn’t Everything

Data is important, but it’s not the only thing that matters. Sometimes, data can actually lead us astray. We’ve all seen examples of this in the news, where a company relies too heavily on data and ends up making a bad decision. In some cases, this can be because the data is simply wrong or incomplete. In other cases, it may be because the company didn’t take into account other important factors that weren’t captured by the data. 

Whatever the reason, businesses need to remember that data isn’t everything and that they need to use their judgment when making decisions based on it. Likewise, both data scientists and decision-makers have to listen to their gut feelings. 

That may sound like some kind of mythical sixth sense, but it couldn’t be further from the truth. Sometimes a feeling is just our brain subconsciously raising an alarm because it processed something we aren’t aware of. Recognize those emotions and take them as another data point.

Explain Your Methodology

Data scientists are sometimes treated like witches or warlocks sitting in a corner, mumbling and writing strange sigils (the mumbling and the sigils might be true, but they have nothing to do with magic). Aside from telling a good story, it’s always a good idea to explain how you built that story. In other words, we should talk about our methods.

For example, we all know that data can be biased. This is a problem that’s been getting a lot of attention lately, as companies have been accused of using data in ways that discriminate against certain groups of people. If a company is using data to target ads, for instance, it may inadvertently end up showing some ads to one demographic group far more often than to another. This can lead to unfairness and even discrimination.

What did you do to avoid this? How did you clean your data? What safety measures did you take to make sure that your AI won’t become self-aware and enslave mankind? (That’s a joke, but even billionaires think that’s a possibility.) The more open and upfront you are about your methods, the more likely it is that people will be willing to trust the results.
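A methodology note is more convincing when it includes a check that others can rerun. For the ad-targeting example, one simple check is to compare how often each group was shown the ad. The sketch below assumes a log of (group, was_shown) pairs; the groups, the counts, and the helper names `selection_rates` and `disparity_ratio` are all invented for illustration.

```python
from collections import Counter

def selection_rates(records):
    """Share of each group that was shown the ad."""
    shown, total = Counter(), Counter()
    for group, was_shown in records:
        total[group] += 1
        shown[group] += int(was_shown)
    return {group: shown[group] / total[group] for group in total}

def disparity_ratio(rates):
    """Lowest group rate divided by the highest; 1.0 means identical rates."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0

# Hypothetical audience log of (group, was_shown) pairs.
records = (
    [("A", True)] * 6 + [("A", False)] * 4
    + [("B", True)] * 3 + [("B", False)] * 7
)

rates = selection_rates(records)
ratio = disparity_ratio(rates)
print(rates, f"disparity ratio: {ratio:.2f}")
```

What ratio counts as acceptable is a policy decision rather than a statistical one, but reporting the number alongside your results shows people that you looked.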

Why Is Data Important?

Data can be used to help us make decisions, understand our customers, review our progress, and pivot when necessary. By analyzing data, businesses can figure out which products or services are most popular and make changes accordingly. Data can also be used to track customer satisfaction levels and identify areas where improvements are needed.

Data can also be used to plan for the future. For example, if a company knows that its sales tend to increase during the holiday season, it can stock up on inventory ahead of time so that it can meet customer demand. Similarly, if a city knows that its population is growing, it can plan for new infrastructure such as schools and roads.

Finally, data can be used to see how things have changed over time. This is important for individuals and organizations alike.

To make the most out of our data, we need to understand what it’s telling us. That’s where data analytics comes in, but it has to go hand in hand with a clear communication channel, trust toward the data, and a data-driven culture that is willing to create data cycles capable of helping us grow and become more efficient. 

By Rocío Belfiore

Chief Research and Development Officer Rocío Belfiore manages teams of specialists and heads all internal software development, from big data projects to business intelligence algorithms. Her department's cooperation and conviction contribute to BairesDev's continual growth.
