Revolutionize Supply Chain Management with Nexla–Future Chain Podcast

Transcription:

Meagan Cunningham (Host): Hello everyone, you’re listening to Future Chain, your source for thought leadership on new technologies in supply chain management. I’m your host, Meagan Cunningham. Our Future Chain team has worked with AI, ML, NLP, and predictive analytics applications in industries from advertising to telecom. We’re gonna talk about all these technologies and how they bring value to supply chain management. We’ll also discuss the overall evolution of SCM. Our guest today is Saket Saurabh, the cofounder and CEO of Nexla.

Nexla is an enterprise data platform with the mission to make data ready to use for everyone. They do this by bringing automation to data engineering tasks like integration, preparation, and monitoring of data. Some of Nexla’s notable customers include LinkedIn, DoorDash, Poshmark, and Bed Bath & Beyond.

Recently recognized for its innovation in modernizing enterprise data and operations in The Channel Company’s Big Data 100 for 2022, Nexla has also been named a Cool Vendor by Gartner.

Saket, welcome to Future Chain. We’re excited to talk with you. First of all, I’d love to ask you: how did your passion for data integration begin?

1:35-4:56: How Saket’s Passion for Data Integration Started

Saket: Hi Meagan, thank you so much for having me here. It’s a pleasure. I would say that my passion really comes from a fundamental obsession with technology. OK, I’ll admit that here. It’s all about understanding the nuts and bolts of complex systems, you know; that’s what I did as a CS student. And that translated into a focus on computing systems. I was at companies like NVIDIA. That did not intersect with data until I was in business school.

And it was during my MBA program at Wharton, which is actually a very deeply technical program in finance and operations and marketing, that I think the math geek in me gained a new perspective of seeing business through a data lens. I sort of got drawn into that side of the world of data. I would add that starting a company in the world of mobile advertising back in 2009, when that space was pretty early, led me into the deep end of data.

Meagan: Wow what a journey. Another thing that I have learned recently about you is that you’ve had some experience working with the World Bank. Would you be able to talk about that with our listeners?

Saket: Yeah, absolutely. When I was in business school, and even now, I have always been deeply drawn to social causes. So it turns out that in 2008 I was through the first year of business school, and in the summer I was fortunate enough to land an internship at the World Bank. I was based in Delhi, and a lot of my work there was focused on agriculture and insurance in India. It’s a huge thing actually, and I had not known that space. Insurance, you know, is very data-driven. When we think about agricultural insurance, it is about conditions like weather and predictive events, and providing insurance against that. So it was incredible just seeing the work that they were doing, or trying to do at the time, and how it could impact literally millions or hundreds of millions of people in the country.

So yeah, I would say it was such a delightful experience that sometimes I think I might still have been doing that if it was not for my wife deciding to join an MBA program herself in 2009. That brought me back to the US, and then the course of life changed for me.

4:57-7:50: Challenges of Data Integration (Poshmark Success Story)

Meagan: Well, that’s a very, very interesting experience, and I’m sure in some ways it still informs the work that you do today. We’re happy that you did come back here. We’re so excited to learn more about Nexla and your work. I’ve noticed you have a very impressive set of case studies on your website. We’d like to learn more about your product offering and platform by discussing how Poshmark used it and the ROI they received. Now, since Poshmark is a marketplace that aggregates buyers and sellers with siloed datasets, I imagine the first challenge was data integration. Would you tell our audience about this process and the challenges you faced?

Saket: Yeah, absolutely, and of course this applies very much to marketplaces like Poshmark because they naturally sit right between millions of buyers and sellers. But the way we also think about this is that every business now has to think about data that is internal to their business. So for Poshmark, it could mean people using their application or visiting their website and looking at products and liking them and sharing them and commenting on them, and all of that stuff that leads to purchases and transactions. There’s a lot of data that companies have internally. But then almost every company has a huge ecosystem of tools and partners and customers and suppliers and everybody else that they work with.

So think about someone like Poshmark. A common tool that a lot of companies use is a customer support system. They’re not gonna build that on their own. They go buy a customer support system so that they can track questions and issues from customers and support them. Another one is shipping packages. You’re tracking them, and they’re going through different carriers in different countries. So if you look at every aspect of business, there is data that is coming in, and naturally integrating that data becomes important.

I think that it has become even more important today because companies realize that they could look at the data that they have internally, like who is visiting my website, who is using my applications, and what products they are liking, and then correlate that with other data, like who is requesting certain types of help or support or seeing issues with their package or delivery. When you bring that all together, you can actually look at the business in a much more fine-grained way, versus trying to take reports from five different systems, put them up on your screen, and figure out what’s going on, because then you don’t have that connectivity of information. So integration becomes almost essential as a foundation, if you will, for many other things you do.

7:51-10:02: How Integration Makes Analytics Better for Companies — “Three Things”

Meagan: That actually leads very well into one of my next questions, which is how integration makes analytics better for companies and what is possible in terms of analysis. I would also like to ask you how many formats you have to work with in a marketplace.

Saket: So there are three aspects in what I would call a triangle of complexity with data. The first is the format. Formats are the different ways in which data is represented: it could be in a document, it could be in a spreadsheet, it could be a structured JSON object. There are different ways in which data can be formatted. That is one aspect of the triangle.

The other aspect is the data systems: customer support and supply chain systems and point of sale and all of those systems that would make sense in e-commerce. And then the third side of the triangle is the velocity of data. Some systems are very real-time. For example, you order a product, you get an e-mail saying here’s your order, and you click on the track order button. That data is real-time. It’s based on where the package currently is. So that is a different velocity than saying that I’m gonna look at my daily sales report and see what’s happened. That’s more like once a day.

These three things are the format, the system, and the velocity. To answer the question, we actually cover a very, very wide and almost complete range across all three. And that’s one of the things about us: our approach to data technology is fundamentally new and innovative. It’s built on fundamental intelligence that we can derive from data, and that’s why, when you mentioned we’re a Gartner Cool Vendor, it’s actually because of that reason. So sometimes people are surprised, like “how could you support that many things?” And we are like “well, we just do it differently.”

10:03-12:32: What Is Data Streaming?

Meagan: Well, here’s to being different. That’s excellent! Can you talk a little bit about what data streaming is?

Saket: Yeah, I mean traditionally a lot of data analysis or analytics has been done periodically. So daily, you wanna see a dashboard of sales. So every day you go and fetch data. That’s what we call batch: a batch that happens every day.

If you get to some operational scenarios: as you mentioned, one of our customers is Bed Bath & Beyond. You go to their website to search for a product, you like it, etc. You can do two things: you can say ship it to me, or you can ask whether it is available at a local store. The moment it says it’s available at a local store, the way you are getting that is that the data flow between the inventory system, the point of sale system, and the website that you’re browsing is all happening continuously. It’s not happening once a day. The availability of that item at the local store might be three units in the morning, and it may be two units in the afternoon.

So this is what we call streaming which means that the connectivity of data is flowing. Think about taking buckets of water and moving it in batches versus just a stream flowing. So that’s where streaming comes into play. 
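As a rough illustration of the batch versus streaming distinction described here, the following Python sketch applies the same inventory changes in two ways: once as a single end-of-day batch, and once as a stream that publishes an updated view after every event. The event shape, the function names, and the sample SKU are all assumptions made up for this illustration; they are not Nexla’s API.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical inventory event: (sku, change in units at the local store).
Event = Tuple[str, int]


def batch_snapshot(start: Dict[str, int], events: List[Event]) -> Dict[str, int]:
    """Batch: apply the whole day's events at once and publish one snapshot."""
    stock = dict(start)
    for sku, delta in events:
        stock[sku] = stock.get(sku, 0) + delta
    return stock


def stream_updates(start: Dict[str, int], events: Iterable[Event]) -> Iterator[Dict[str, int]]:
    """Streaming: publish an updated view after every single event."""
    stock = dict(start)
    for sku, delta in events:
        stock[sku] = stock.get(sku, 0) + delta
        yield dict(stock)  # copy, so each published view is independent


morning = {"towel": 3}
sales = [("towel", -1), ("towel", -1)]

end_of_day = batch_snapshot(morning, sales)        # one view, available once a day
live_views = list(stream_updates(morning, sales))  # one view per event, as it happens
```

Both approaches end at the same final stock level; the difference is that the streaming version exposes every intermediate state, which is what a "is it available at a local store" check needs.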

What we’re finding is that you can bucket data use cases into broadly three things. One is analytics: looking back at how the business is doing. The second is predictive or data science, which is trying to take that historical information and say what’s going to happen in the future. The third is operational, meaning is this product available, order tracking, and stuff like that. We find that more operational use cases are slowly becoming streaming. We actually saw that happen during COVID. If you remember, toilet paper would run out. It would come in the morning and be gone in the afternoon. Our customers like DoorDash and Instacart deliver products. Now you can no longer rely on getting a report every day of how many of which products are available in the store, because they could be gone in hours. So you have to start thinking immediately that the thing you were doing once a day actually needs to become streaming and real-time.

12:33-15:37: What Is the Interface for Engineering if Hard Coding Is Not Required?

Meagan: Wonderful, thank you for that answer. What is the interface for engineering if hard coding is not required?

Saket: Yeah, I think coding has been the interface to data for most of our history of working with data. It was done by IT and engineering teams. When we say ready-to-use data, somebody is writing code and making the data ready to use, so data is being delivered to you in your dashboard or in your spreadsheet. But somebody’s doing the work of connecting data from all these different systems and putting that together.

What we have to recognize now is that today everybody needs to work with data. You could be doing recruiting at a company in HR, and you need to work with data. You could be doing ads and marketing. So every function needs to work with data. 

That’s one part and just like everything else data has this 80/20 rule in the sense that 80% of these use cases or problems can be prepackaged and become no-code. So when you ask about hard coding for engineering, our whole approach is to get most things to be no-code and easy for people, so most people can go and use it and not be intimidated by that.

However, there will be 20% of use cases that are complex. They cannot be handled just based on past patterns. And then there are three things coming into play. One is intelligence: can we intelligently understand things? That alone cannot solve all problems. Then comes low-code: can you write a little bit of code, five to ten lines, and make something happen in a big ecosystem? Most of the things I can do without code: I can read the data, I can process it, I can analyze it. But I need data validation, and I need to write three lines of code to create a data validation rule. So that is low-code, for example.

Then the last bit is a platform. This is where the engineers come into play. So when you ask for the interface for engineering, we believe that it’s good to give them a platform where a lot of things are already taken care of: managing credentials, having streaming frameworks, and being able to scale performance. But if you’re an engineer, you can actually write code to leverage these primitives and solve the business problem. You can complete tasks a lot more efficiently and a lot more reliably. So that becomes the interface that we call the platform. The interface becomes APIs; the interface becomes SDKs; the interfaces are also command-line tools, things that allow you to take our power and capability into a command that you can insert in other places or run programmatically, and all that stuff.
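To make the "three lines of code for a validation rule" idea concrete, here is a minimal Python sketch. The rule itself is the short function `price_is_valid`; the helper `split_by_rule` stands in for whatever the surrounding platform would already provide. Both names, the record shape, and the price check are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]


# The small piece a user would write: reject missing or non-positive prices.
def price_is_valid(record: Record) -> bool:
    price = record.get("price")
    return isinstance(price, (int, float)) and not isinstance(price, bool) and price > 0


# Stand-in for the platform side: route each record based on the rule.
def split_by_rule(
    records: List[Record], rule: Callable[[Record], bool]
) -> Tuple[List[Record], List[Record]]:
    valid: List[Record] = []
    rejected: List[Record] = []
    for record in records:
        (valid if rule(record) else rejected).append(record)
    return valid, rejected


records = [
    {"sku": "A1", "price": 19.99},
    {"sku": "B2", "price": 0.0},  # suspicious: a free product
    {"sku": "C3"},                # price missing entirely
]
valid, rejected = split_by_rule(records, price_is_valid)
```

The point of the split is that the user only supplies the rule, while reading, routing, and reporting of rejected records stay with the platform.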

15:38-18:25: What Levels of Errors Are You Seeing from Your Data Monitoring?

Meagan: Excellent! What levels of errors are you seeing from your data monitoring?

Saket: I mean, our job in terms of monitoring is to just report what’s right and what’s wrong. In those cases where we’re monitoring, it very much depends on the underlying quality of data, which then depends on where the data is coming from.

Generally, as you can expect, when data is generated by machines (for example, you bought a product and a tracking number was generated for it by a package provider like UPS), those are programmatic or automated systems, and there is usually not a lot of data error in there. But when there is human entry, you will see errors happen. Don’t get me wrong, even programmatic systems have data errors. The primary reason is that the systems get programmed. For example, again, think about a package being delivered: an engineer went in and put in some new code and said, “So far when we tracked packages, we recorded the way it was shipped and which places it arrived at and all that stuff. I’m going to add some additional information in here to enhance that capability.”

Maybe they get GPS location in some places. Sometimes they get the GPS location of the truck so they can actually see where the package is right at this point; let’s say that’s what they’re trying to do. Now what happens in those cases, when we program new systems and we update them, is that there are often errors. It turns out that some of the errors are happening because a particular truck doesn’t have a GPS. The system will report some erroneous data. The default, I think, in most location systems is that if they can’t find the location, they set it to a place in Kansas as a default GPS location in the US. That’s why sometimes when you look at massive amounts of location data, and we saw that in the advertising world, somehow you see a huge audience of people in this very small place in Kansas. What’s happening is that all these GPS locations are wrong, and that’s why they are all set to this default value. So interesting things happen even with programmatic data. Ultimately for monitoring, it comes down to observing the patterns of data to be able to trigger an alert and notify the person that this looks erroneous, and sometimes it needs user intervention.
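One simple way to observe a pattern like the default-location cluster is to flag any coordinate that accounts for an implausibly large share of records. The Python sketch below assumes that approach; the function name, the 20% threshold, the rounding precision, and the placeholder coordinate used as the "default" are all invented for illustration and do not describe how any particular monitoring product works.

```python
from collections import Counter
from typing import List, Tuple

Coordinate = Tuple[float, float]  # (latitude, longitude)


def suspicious_locations(
    points: List[Coordinate], max_share: float = 0.2, precision: int = 4
) -> List[Coordinate]:
    """Return coordinates that account for more than max_share of all records.

    Real traffic rarely lands on exactly the same point, so one coordinate
    holding a large share of records suggests a default or fallback value.
    """
    if not points:
        return []
    counts = Counter((round(lat, precision), round(lon, precision)) for lat, lon in points)
    total = len(points)
    return [coord for coord, n in counts.items() if n / total > max_share]


# Placeholder default: (39.0, -98.0) is a made-up stand-in, not a documented value.
points = [
    (40.7128, -74.0060),
    (34.0522, -118.2437),
    (41.8781, -87.6298),
    (39.0, -98.0),
    (39.0, -98.0),
    (39.0, -98.0),
]
flagged = suspicious_locations(points)
```

A check like this only raises a flag; as the discussion notes, deciding whether the cluster is truly erroneous may still need a person to look at it.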

18:26-20:44: What Impacts Data Integrity — and How?

Meagan: Right, and how is all that impacting the integrity?

Saket: Data integrity is 100% dependent on the quality of data and how you’re using it. Let’s say you use your sales report every day. The integrity of your report and what you’re seeing is not just a function of what was entered. Maybe all the data was correct, but some calculation or some analysis was off, like if you connected it with the wrong dates and stuff like that. So integrity is, I would say, a function of both: the quality of the data and also how it’s getting used.

There’s a big effort going on today in the world of what we call lineage of data. We look at a piece of data and understand what journey it takes: where it started, where it went, what processing happened, and where it came from, so that when we find errors we can go back and trace them: “oh, looks like this system somewhere is having this issue, or this logic here is periodically making the wrong calculation.” This happens all the time, by the way. Like when some price entry for a product is, let’s say, $0.00 and somebody did some calculation with it, then the results are inconsistent. So there is a huge amount of stuff happening on lineage.

I think we might be the only company, or one of the few, that is able to give that lineage insight for every record. Normally people have that insight about big blocks of data, but cannot trace one individual piece of data at that level. So we are able to tell people that this piece of data actually comes from this API call or from that line in that file, and that it’s been through this journey. So if, let’s say, the price is wrong or the image is wrong, we can actually trace it back to the origin. You can see whether you got the wrong data in the first place, maybe from someone, or whether there was some processing mistake in between.
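The idea of record-level lineage can be sketched in a few lines of Python: each record carries its own trail, starting with where it came from and growing by one entry for every processing step. This is only an illustrative model under assumed names (`TracedRecord`, `from_source`, `apply_step`, and the sample file and step names are all hypothetical); it is not a description of how Nexla implements lineage.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class TracedRecord:
    data: Dict[str, Any]
    lineage: List[str] = field(default_factory=list)  # oldest entry first


def from_source(data: Dict[str, Any], origin: str) -> TracedRecord:
    """Create a record that remembers where it came from."""
    return TracedRecord(dict(data), [f"source: {origin}"])


def apply_step(
    record: TracedRecord, name: str, fn: Callable[[Dict[str, Any]], Dict[str, Any]]
) -> TracedRecord:
    """Run one processing step and append it to the record's own trail."""
    return TracedRecord(fn(dict(record.data)), record.lineage + [f"step: {name}"])


# A product whose price arrived as 0.0 from a hypothetical file.
rec = from_source({"sku": "A1", "price": 0.0}, "products.csv line 42")
rec = apply_step(
    rec, "add_tax", lambda d: {**d, "price_with_tax": round(d["price"] * 1.08, 2)}
)

# When the output looks wrong, the trail shows whether the bad value came
# from the source or from a step in between.
trail = rec.lineage
```

Here the trail shows that the zero price was already present at the source, so the tax step did not introduce the problem.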

20:45-23:24: On the Insights Nexla Delivers to Poshmark

Meagan: Well, that’s very powerful information for your customers. Back to talking a little bit about your specific experiences working with Poshmark, how many more insights would you say that Nexla delivered to Poshmark than the company might have generated without using the solution?

Saket: What we do is make the data ready to use, and it ends up in their data warehouse and their tools, so I wouldn’t exactly know where all the data is eventually getting used; we’re not into that. But we do hear from our customers at Poshmark what they have been able to do. One of the key things that they were able to do was plug into the customer support system and get very fine-grained, event-level data from it, and really be able to understand where their services are performing really well, where the areas of improvement are, and who can solve what kind of problems better than other people when it comes to a support question. Then they were able to connect all of that to how it impacts customers’ behavior, their loyalty, their lifetime value, and all that stuff. In that line alone there are multiple use cases: lifetime value is one part; assigning agents to support issues is another part.

There are many areas, but I wouldn’t have an exact count. What we have seen is that those use cases spread over time to more and more places. People learn that and say, “Hey, wait a minute, I was able to do this thing that used to take months to get done, and now I can do it in a couple of days or even hours!” That opens people up to using our system in more places. Over time, as I mentioned, it has been everything from internal enterprise systems, which is how they run and manage their business, to the marketplace, to order tracking, to data that drives marketing as well, for example sending emails and seeing whether they are being delivered or not.

23:25-24:28: Describing the Ideas from the Flow of Data Pipelines — Analyses to Insights to Decisions

Meagan: That’s excellent! Besides being incredibly time-saving, I’m sure, for many of these processes with Poshmark, I’m wondering if you could describe one of the ideas generated by Poshmark from the flow of data pipelines: anything from analysis to insights to decisions.

Saket: Yeah I think certainly we covered in our case study about how they have been able to understand and improve the customer support aspect of it.  I’m thinking about what other ideas have come across…you know some of the ideas have been around on the marketing side as well in terms of the e-mail campaigns. Let me think about what we are publicly allowed to share. Since we do get used in many applications, some of them are sensitive to business.

24:29-26:39: The Future of Spreadsheets — Will They Stay?

Meagan: On a more general level, do you think that executives in the marketplace will continue to use spreadsheets, even when there are more robust reporting tools out there?

Saket: I think let me say that people will continue to use every tool that they have, and new tools will come and old ones will stay. That’s been the state of technology in general. But spreadsheets in particular, while we think of them as like very basic because they’ve been around for such a long time, they’re incredibly powerful, especially for executives. You actually see more and more executives that eventually want to see the data in a spreadsheet because one thing that spreadsheets allowed them to do is do scenario analysis and modeling. So you can look at the report and say “hey this is how things are like. What will things be like if I made this change?” That has to be modeled and the best way to model that often is in a spreadsheet where you are able to pull that same data into a spreadsheet and give them a few variables to try out and see what the impact is.

Spreadsheets are gonna be there. Spreadsheets are an amazing piece of invention in the data space. I mean they don’t require you to be a programmer, but you can go very far in terms of what you can do with their built-in functions in terms of the flexibility that coding gives you. You sort of get that very non-intimidating framework and for various reasons, almost everyone who has gone through certain levels of business function has ended up getting into spreadsheets whether you know by working on it or through formal education like learning to create financial models in business school. 

26:40-29:10: What Does Data Integration Make Possible in Terms of AI?

Meagan: Yeah, well, it’s wonderful that we can continue to utilize tools that we’re familiar with as well as each new solution. And on that note, what does data integration make possible in terms of artificial intelligence?

Saket: Artificial intelligence, you know, just feeds on data. The raw material for that whole innovation, that rocket ship that people are trying to build powered by AI, is data. We have seen that with one of the top two banks in the country, a Nexla customer, which has built an enterprise-wide BI platform with Nexla as the data stack. One of the things that happens in that world is that you experiment a lot. You have to try various ideas and see what fits in and what is high value. For example, let’s say you are trying to predict your sales in the coming six months. There are changes in the economy, there are changes in patterns, and all that stuff. Now you can’t do that prediction for the next six months based only on the last six months of data, because the same six-month period last year was slightly different. More people were at home; they were not out there; the economy was different, and so on.

So when we think about data scientists, they are constantly thinking about “is this a good idea? Could I combine maybe census data to see if this makes sense? Could I take a look at travel data and then try to predict how people will end up buying their product or not because if they’re traveling on vacation?” There are so many ways in which you have ideas and all of them ultimately depend on can I grab that data, bring it into my system and look at it all together. I would say that many data scientists have found that the hard truth is that they end up doing a lot of work.

The bulk of their work becomes trying to integrate data, trying to bring it into one place, trying to make it in the right format so they can use it, trying to extract the features from that, and trying to process the training data. We’re constantly finding that Nexla becomes like an accelerator for people. Instead of spending months trying and seeing whether it works or not, people can actually do it in two days with Nexla.

29:11-30:39: Who Benefits Most from Nexla’s Solution — Engineers, Data Scientists, or Executives?

Meagan: Amazing. So on that note, my last question for you today, Saket, is: who do you think benefits most from Nexla’s solution: engineering, data science, or the executive suite?

Saket: It’s a whole chain. The way we think about this is: number one, today, engineering is the one who has to deliver the data in a ready-to-use form for the different users. They are the initial beneficiaries. They are able to tell people, “Hey, I don’t have to be the bottleneck. I don’t have to solve all of these things. You can go use this tool, and most of the things that you asked me for you can now do on your own.” So you have suddenly empowered that person to not be dependent, and you have removed that one big piece of friction and back-and-forth. So engineering is absolutely the number one beneficiary. The downstream effect is that now this person is not waiting on an engineer to find that data to apply it in their model. Now they can do their job faster and better. They are not waiting for most of the things. That benefit flows further along in the organization. That’s how we see it: it’s all interconnected. It’s not like one team in one silo is just doing this and nobody else is impacted.

30:40: Outro

Meagan: Well, very impactful for the entire organization. I wanna thank you very much, Saket, for joining us today. Thanks to you, our listeners. Please visit your favorite podcast platform to subscribe to Future Chain and give us a review. Also feel free to e-mail us at info@futurechain.org with any feedback, questions, or guest suggestions. We are building a resource for you. Until next time.

 
