March 1, 2023

10:45 am - 11:15 am PST

Why is no one visiting my lakehouse?

Unlocking the power of your lakehouse for low- and no-code users

The modern data stack has been evolving in recent years, allowing for broader access than ever before for people who have the skillset to leverage your lakehouse. However, that remains a small portion of your overall team, and if you want to get the full value out of your data investments, you need to unlock the front door of your lakehouse for low- and no-code users. Come learn best practices for enabling new users to access your lakehouse without unnecessary duplication of data.

Sign up to watch all Subsurface 2023 sessions


Note: This transcript was created using speech recognition software. It may contain errors.

Mohammed Aaser:

One of the things that, you know, I’ve always been thinking about is that organizations have invested tremendous amounts of resources in building up their data lakes, their lakehouses, and so on. But often, when asked how much value they’ve gotten, they can point to some examples, but they often say, well, there’s a tremendous amount of additional value that we want to achieve from these existing investments that we’ve made. And it really inspired me to go on a path to understand the different personas that are using data and the value that they’re getting. So a few months back, I ran an analysis on one of the largest CPG companies. And, you know, on LinkedIn, they had over 23,000 employees. And I wanted to dissect the different types of roles that this company has. And when I took a look through, I found that they had less than 1% of their staff that are data scientists or engineers.

And, you know, maybe they have their Spark clusters and their MLOps platforms, and they’re working on some really high-value use cases. In addition, they had about 2% that were BI and data warehousing specialists. These are folks that are, you know, creating reports. In some cases they’re creating some data sets, doing ETL jobs, data architecture, standing up the warehouses. And this 3%, and I wouldn’t be surprised if some of those on the session today are part of this 3%, they’ve had a very large impact on the business. In many cases they’ve had notable wins. However, from this 3%, often what I hear when I discuss with them is that the needs of the organization far outstrip the capacity of this 3%. And, you know, diving a bit deeper, you start to see that the other 97% are in the line of business: in marketing, sales, finance, operations, you name it, right?

And many of these teams, they’ve invested in some business analysts. And I found three unique things about this 97%. The first is that they often have data sources that they use that may not be in some of those investments that have been made by the 3%. They have their own line-of-business SaaS tools. The second thing that I observe about this group is that they’re often closest to the problems. They’re seeing out-of-stock issues, and they’re understanding the impact on their bottom line faster than almost anyone else in the business. And the third is they live and spend most of their time in desktop tools, Excel, quite frankly. And so the question that I’ve always wondered is: that 3% did have this great impact, but the needs of the organization often far outstrip the capacity of that 3%. How do you empower the 97% to truly tap into all those great investments that the 3% have made and solve problems more rapidly?

And that led me down a path to understand a bit more about what that journey for the 97% looks like. And so, you know, Brett did a brief introduction, but I’ve been in the data and analytics space now for the last 10 years. And some of these are my observations from spending time at McKinsey and leading their analytics, as well as at Ameriprise Financial. And now at Domo, we have a chance to work with hundreds of our customers. So when I do a deep dive on that 97%, I mean, the biggest thing that I observe is that they often face what I call a disjointed self-service data journey. And I’ve seen three flavors of this data journey for the 97%. The first one is the desktop journey, where you take extracts from your line-of-business tools, Excel files, maybe even from the data warehouse, just literally extracts.

And then you join it together in some desktop tools. You visualize it, and then you embed it into emails and PowerPoints, and you mail it around. And we know some of the challenges. It’s hard to get refreshed data into these formats. It’s hard to have governance on top of this data. It’s also hard, quite frankly, for people to be working off the same data when they’re making decisions. The second version I’ve seen of this is, you know, more forward-looking organizations saying, Hey, we wanna have the best-of-breed cloud-based tools. And so they’ve invested in data catalogs. They’ve got a data lakehouse or a warehouse set up, there’s Excel extracts, they have their SaaS tools. They have a cloud data integration tool that they use to push data back into the lakehouse. They then take that lakehouse and connect it to a visualization tool.

They take that visualization and publish it. Well, if this sounds complex, for most of the 97% it is. Because there are so many steps and so many different tools, they give up. And what ultimately happens is that they revert back to more of a reporting approach. They go back to the 3% and say, can you help me with this problem? They write an email out. That team builds, you know, a data set and an analysis. And often what you find is those requests get prioritized. They get prioritized often based on seniority, more so than importance in terms of the value of the analysis. But in any case, that analysis gets created. There are several iterations. Once that analysis is created, the line-of-business team looks at that analysis. And you know what they do? They extract it to Excel and they put it into a PowerPoint.

They email it around. And in my mind, I’ve always thought, there’s gotta be a better way. And so that’s where Domo comes in. Domo enables a friction-free journey for that 97% to, you know, enable organizations to multiply the value of their existing data investments. So how this is done is, if you think about that journey from connecting to the data sources, whether it’s your data lakehouse or your SaaS tools that you have, or existing certified data sets, or, you know, the tremendous amount of Google and Excel spreadsheets and other CSV extracts, we allow business users to be able to connect to those data sets, in addition to those well-governed data sets. They can then set up joins in a drag-and-drop way, and then automate those data flows so that the data remains fresh and people are working off consistent data in real time.

Once that data is prepared, they can start to build visualizations and go beyond visualizations to even create stories, almost like mini websites that are accessible on their mobile phone, with images and text, that allow business leaders to send insights to people and truly influence changes in decision making. And when those insights are created, there’s always this question around governance. So we have a certification process. So data teams as well as business teams can certify which data sets are relevant, and they make sure the right people have access to only the data sets that they’re allowed to. And then finally, once that data is together, businesses need to make decisions. If, for example, accounts receivable needs to have an alert on cash-flow-related questions, they can receive alerts when payments are made or when payments should be made.

Teams can also share data with one another, as well as insights, enable access on the mobile phone, and then bring that data right into where they work, in PowerPoint and Excel and Google Sheets and so on, work with that data, and then publish it back into their environments. And the beauty of this is that when I’ve seen organizations, especially as they mature, they start with their data lakehouse, where they aggregate a number of their very valuable data sets, but they start to move towards this ecosystem where the 97%, actually the different line-of-business teams, stand up and get their own Domo instances, their own Domo environments. And because of this, these different teams can then tap into the great investments made in the data lake, either by federated querying of the data directly in the data lake, or, in some cases where it makes sense, also caching that data into their Domo environments. What they can also do then is, across the different teams, share data sets and insights. And this allows ultimately for the teams to have governance on top of the data that they have and creates a data ecosystem within the organization.

So, you know, I’ve talked about this from a technical perspective and also some of the business challenges I’ve seen. I wanna really bring this to life. The example that I often think of is Samantha. And Samantha is a product leader at a tech company. I consider her to be a data athlete. I had an opportunity to, you know, analyze some of the world’s greatest athletes. And I found that they had these five characteristics. And this is not uncommon for many folks in the line of business to have some of these characteristics and these types of profiles truly transform the company. So let me walk through her example. First, it’s starting out with a mindset to perform and win. She came into this organization with a vision to say, I wanna help, or I wanna have the most profitable and fastest growing product line in the business.

She, you know, she participated in what I call obsessive practice, right? Athletes kind of learn all their skills, they train and practice a lot. So what she did is she actually spent a ton of time understanding what’s all the data available around how customers are using our product. And then she started actually joining a lot of those data sets together in Domo herself. And that allowed her to create a state of the state, you know, what was happening with the product today. A lot of the athletes also created new moves, new plays. And so in the same vein, what Samantha did is she had curiosity, and that led her to new hypotheses. And so she spent an hour each day, completely unstructured, investigating the data and testing hypotheses. It led her to create a new customer segmentation model and marketing tactics that they should take.

She then also built a deep understanding of what was happening in the market, including the competition. And she started to both interview customers as well as do surveys to understand why they were joining the product and why they were leaving the product. That influenced her tactics as well. And then she didn’t work by herself. She actually worked very closely with the tech and data teams. She worked with the tech and data teams to get new data that was coming from their platform. And she also went back to the software engineers to say, Hey, I need some pieces instrumented in our products, so that way I have the data and it can help influence my segmentation and my tactics. So she brought it all together, and the results were quite impressive.

Her product grew 30% year over year. And what was most interesting is that, in fact, post-COVID, the company’s overall revenue slumped 15%. Her product line went up 30%. She had the most profitable product line in the company, recognized by her leadership. And then ultimately, you know, she was a great role model. All the leaders basically said, Hey, go talk to Samantha. You’re working on a product line. Look at what Samantha’s doing and try to bring that into the product line you run. Now, I’d love to say that Samantha’s the only example, but in fact, I’ve interviewed dozens of these types of folks who are in the 97%, and they’ve used tools like Domo to transform their business. Beth, who’s at a CPG company, basically brought together data from across all their different systems and tools, from the great investments that they’ve made.

And she helped identify huge improvement areas in terms of their margins. And so within two years they were able to see margin improvement by several points. Yasha, who’s at a research company, basically said, there’s gotta be a better way than just writing out research like we’ve traditionally done. We gotta actually create data products. And so he worked across the organization to build over a dozen new data products, which the company’s now monetized. George, who works at a manufacturing plant, basically said, I wanna understand all of the different things that are happening in the business, from the uptime and the downtime of the machines to employee productivity and so on. And he saw an increase of over 10% in plant productivity. It actually yielded a massive savings in energy costs and was great for their sustainability as well. And Travis, who was at a retailer, basically said, Hey, there’s gotta be a better way for our store managers to understand and run their business.

And so he brought together all the data around the inventory and customer satisfaction, and he created a view for the store managers to better run their business. And after piloting it, they saw over a 5% improvement in their sales productivity and top line, and they rolled it out to all of their stores. And what’s really interesting about all of these data athletes is that none of them knew SQL. In fact, none of them knew how to code. None of them, you know, came from traditional tech or data backgrounds. They all were able to basically leverage their curiosity and use platforms like Domo and tap into all the great investments that the organization has made. So with that, I’d like to transition over to my good friend Ron Caris, who is a senior solution consultant here at Domo, to walk through how this self-service data journey actually looks in real life, demonstrating how organizations can take their great investments in Dremio and enable their 97% using the Domo platform. So with that, off to you.

Ron Caris:

Hi everyone. I’m Ron Caris. I’m a solution consultant with Domo. I’m gonna start this demonstration in Dremio. I’m sure everyone here knows about Dremio, a great lakehouse product that allows you to virtualize massive amounts of data from multiple data sources. This is a great product to assemble all the data you need across the various data sources, but for that 97% of users that Mohammed was talking about, it may be challenging to use this to find the data to serve themselves, to answer the business questions that come up in their day-to-day activities. Here in the Dremio interface, you can see I can issue SQL statements and return data quite easily. Well, maybe not that easily if you’re a non-technical user and you don’t know SQL. Let me navigate over to Domo. Domo has truly created an integrated experience with Dremio. With Domo’s Cloud Amplifier, we can quickly connect to Dremio and expose that data so it can be used by these users in a self-service manner, empowering them to leverage the massive amounts of data exposed in Dremio in order to answer those business questions.

Now, I’m gonna come back to this data in a little while. I’ll use it in my demo, but let’s start off by looking at some dashboards that were created with Dremio data. So this is an example of an operational dashboard built with that data coming right outta Dremio. This dashboard deals with marketing and sales conversions. A variety of visualizations have been used in this dashboard to reveal sales performance trends, the things that would be valuable to sales leaders in any organization. And under the covers, we can see here, a little further on, that Domo’s data science features have been used to identify opportunities that are more likely to close and give sales teams the right focus to be more successful. Truly a great way to leverage data. Now imagine I’m part of that sales organization working with this dashboard, and my boss walks into my office and says, Hey Ron, I need a report that shows me commissions we spent, broken down by regions and sub-regions.

I need this to report back to my finance folks. Well, in any other platform, this might be a daunting task for a non-technical user, but in Domo, it’s actually quite simple. Let me navigate right back to the data center over here and find a data set of type Dremio that might satisfy this request. I see this marketing data set here. It’s actually certified for self-service use. And this certification process is very simple in Domo. You can apply it to cards and tables, and it gives users the confidence that they can use this data. I can sample the data without issuing a SQL statement. You know, I can do things like filter on it, I can look at stats, but I’m interested in the schema. I want to see if there’s the right data I need to create that report. I can see there are some missing elements, but you know what?

I’ve got a spreadsheet with some additional information that I can blend this data with to produce that report. Now, later on, I can go through the process of having this data approved and placed into the lakehouse, but for now, I want to get that report out quickly. Now, I’ve already uploaded the file using this file upload capability in Domo. I just dragged the spreadsheet onto here and followed the wizard through the process. I can see that’s this file upload right here. But the next thing I really have to do is create a transformation job, an ETL job that joins that data with the Dremio data, right? So Magic ETL is a very powerful tool that can be used to combine data, to cleanse data, to create calculations and the like. And you’ll see here, all I need to do is take these action tiles and place them on the canvas and then tie them together in defining the process.

I need to blend the data and do whatever I want. This can be as simple or complex as you want. All I need to do here is select the specific configurations for each of these tiles. In this case, I’m choosing the correct data sets as input. I’m gonna join them based on a common join key, and that’s that account rep ID. And then I’ll just resolve that. And then I’ll create a calculation that represents the commission rate multiplied by the deal amount that’s over here. And I’ll call that the comm amt. I’ll save and close this. The next thing that I’d have to do is name the data set and then define the schedule upon which this will run, or whether it’ll run every time the data is updated. Now, I’m not gonna save this. I’ve already done this in advance.
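As a rough illustration of the join and calculation described here, the same logic can be sketched in plain Python. This is not Magic ETL or any Domo or Dremio API; the column names (`account_rep_id`, `commission_rate`, `deal_amount`), the sample rows, and the assumption that the commission rate lives in the uploaded spreadsheet are all hypothetical.

```python
# Illustrative sketch only: sample rows and column names are assumptions,
# not the actual Dremio or Domo schemas from the demo.
lakehouse_deals = [
    {"account_rep_id": 1, "region": "West", "sub_region": "Pacific", "deal_amount": 10000.0},
    {"account_rep_id": 2, "region": "East", "sub_region": "Atlantic", "deal_amount": 4000.0},
    {"account_rep_id": 3, "region": "West", "sub_region": "Mountain", "deal_amount": 2500.0},
]
spreadsheet_rates = [
    {"account_rep_id": 1, "commission_rate": 0.05},
    {"account_rep_id": 2, "commission_rate": 0.10},
]


def blend_commissions(deals, rates):
    """Inner-join deals to rates on account_rep_id and add a commission amount."""
    rate_by_rep = {row["account_rep_id"]: row["commission_rate"] for row in rates}
    blended = []
    for deal in deals:
        rate = rate_by_rep.get(deal["account_rep_id"])
        if rate is None:
            continue  # inner join: skip deals whose rep has no rate row
        blended.append({
            **deal,
            "commission_rate": rate,
            # the calculated column: commission rate multiplied by deal amount
            "commission_amount": round(deal["deal_amount"] * rate, 2),
        })
    return blended


blended_rows = blend_commissions(lakehouse_deals, spreadsheet_rates)
for row in blended_rows:
    print(row["account_rep_id"], row["region"], row["commission_amount"])
```

An inner join is used here for simplicity, so the deal for rep 3, which has no rate in the spreadsheet, is dropped; a left join that keeps unmatched deals would be an equally reasonable choice.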

I’ve produced the output data set so we can get to it quickly. And I can see that data set is right over here. It’s got all the data I need. Once again, it’s a combined and blended data set. But really what I want to do now is create a visualization with that, create the report that my boss was asking for. And so when I click on create a visualization, I’m in the Analyzer. We also call it the card builder. It’s just the tool we use to create reports, charts and graphs, whatever you wanna call them. And you can see when I get in there, based on algorithms, it shows you an example of what you can build with the data. This is not exactly what I want. So I’m gonna throw this away. I’m gonna select a horizontal bar chart from the some 200 charts that are available through these chart categories on the right-hand side.

And then I’m gonna search for my region dimensions so I can place them on the y axis and the series. And then I’ll find that commission amount and drag it over here. And there you have it. I have my report. Let me just do a little bit of cleanup here. I’m gonna format these; they’re currency amounts. And guess what? I think the summary number should be, let’s make the summary number that commission amount, the sum of the commission amount, and I’ll format that as currency as well. I’ll give this report a meaningful name, we’ll call it the commission report. And I will save this to a page that I’ve prepared in advance to show you how easy it’ll be to build out an interactive dashboard. So let’s just quickly navigate there. And you see here I’ve also prepared a couple of other charts in advance.
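The report built here amounts to a grouped sum with a grand total. A minimal Python sketch of that aggregation follows; it is only an illustration of the idea, not what the card builder does internally, and the rows, field names, and the `format_currency` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical blended rows; values and field names are assumptions.
rows = [
    {"region": "West", "sub_region": "Pacific", "commission_amount": 500.0},
    {"region": "West", "sub_region": "Pacific", "commission_amount": 250.0},
    {"region": "West", "sub_region": "Mountain", "commission_amount": 125.0},
    {"region": "East", "sub_region": "Atlantic", "commission_amount": 400.0},
]


def commission_by_region(rows):
    """Sum commission_amount for each (region, sub_region) pair."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["region"], row["sub_region"])] += row["commission_amount"]
    return dict(totals)


def format_currency(amount):
    """Format a number as a US-style currency string."""
    return f"${amount:,.2f}"


totals = commission_by_region(rows)
summary_number = sum(totals.values())  # the chart's summary number: total commission

for (region, sub_region), amount in sorted(totals.items()):
    print(f"{region:<6}{sub_region:<10}{format_currency(amount)}")
print("Total:", format_currency(summary_number))
```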

And I’ve got an image here to make it a little pretty. And so what I need to do here is just design the dashboard, and it’ll bring me into the dashboard editor. And guess what? The skill level necessary here to assemble an interactive dashboard is about the same as building a PowerPoint presentation. So I’m just gonna format things around here so that they look good. And in fact, as I build something for the web interface, I immediately build it as well for a dashboard on the mobile interface, right? I can alter this if I want to, but it looks good to me. But now I’ve got a dashboard that’s available both on a mobile device as well as in a browser. So I’ll save that, and I think we’re good to go.

Now, I know my boss wants to be alerted when new deals come in. So, you know, out of this visualization, I can create alerts. In fact, out of any visualization I can. This bell icon gives me access to a wizard where I can define these alerts based on any charted item. So I’m gonna choose a summary number, which is my amount of deals that come in by day. And if it increases by one, I want to send her a message. I’ll click on next. This is a configurable message; it’s appropriate for now. And then, you know, I can select her from a user list and send it out to her. I’ll save this. I won’t send it out to her just yet. I’ll just save it for myself. And now I will get a message every time a deal is closed and I can act on it.
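The alert rule described here, notify when the daily deal count goes up by one, can be sketched as a simple threshold check. This is a hypothetical Python illustration, not Domo's alerting API; the function names, the sample readings, and the recipient address are assumptions.

```python
def deals_increased(previous_count, current_count, min_increase=1):
    """Return True when the count rose by at least min_increase."""
    return (current_count - previous_count) >= min_increase


def build_alert(previous_count, current_count, recipient):
    """Return an alert message, or None when the rule does not fire."""
    if not deals_increased(previous_count, current_count):
        return None
    new_deals = current_count - previous_count
    return f"To {recipient}: {new_deals} new deal(s) closed today (total {current_count})."


# Simulated readings of the summary number over one day (assumed values).
readings = [3, 3, 4, 6]
alerts = []
for previous, current in zip(readings, readings[1:]):
    message = build_alert(previous, current, recipient="sales.lead@example.com")
    if message is not None:
        alerts.append(message)

for line in alerts:
    print(line)
```

Comparing each reading with the one before it means an unchanged count produces no message, which matches the "only when a deal comes in" behavior described in the demo.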

And then certainly she can act on it as well if she creates an alert for herself. So there you have it. Well, I hope I’ve demonstrated how easy it is for non-technical users to leverage the massive amounts of data that can be exposed by the Dremio lakehouse. We can also eliminate bottlenecks in delivering valuable intelligence to business users, even if that data doesn’t currently reside in the lakehouse. So in a lot of organizations, this type of agility is necessary to remain competitive in a challenging business environment. Well, thanks again, and I’ll pass it back to Mohammed for some closing comments.

Mohammed Aaser:

Excellent, thank you so much, Ron. I’m just gonna go ahead, and I know we’re getting close on time. I wanna make sure that we can get to Q&A. I just wanna do a quick summary of some of the things that I heard and that we discussed. One is that, you know, organizations have made huge investments in data, and many have seen results, but often the needs continue to outstrip the capacity of that 3%. That 97%, those line-of-business teams, they’re often closest to the problems. They have their own data, and they often have difficulty tapping into all the great investments that the 3% have made. And so Domo, as you’ve seen today, truly enables that complete self-service data journey, empowering the 97% to use fresh and updated data sources from the lakehouse as well as their other data sources to solve problems rapidly. And these empowered business users, I shared examples of these data athletes, they’ve had huge impact on the organization’s bottom line. And for many of us here in this audience, I think we are the tech and often data leaders in our organizations. And Domo is one of those key enablers that, you know, when you marry it with platforms like Dremio, truly transform the company and help elevate all of the great investments that your teams have made. So with that, I want to thank you.