Modern Real-Time Data Stack: Emerging Cloud Architectures for Streaming Data Analytics

Digital disrupters are using real-time data to transform the user experience, offering real-time ETAs, personalization and predictive capabilities. With event streaming platforms like Kafka and Kinesis real-time data is now easily available. But real-time analytics (RTA) to power dashboards and applications continues to be a challenge. Cloud warehouses built for batch simply don’t cut it - they are too slow and expensive for RTA.

In this talk, we’ll discuss the modern real-time data stack that puts RTA within reach of all developers. Here’s what we have in store for the talk:

The modern data stack and key use cases deployed by digital disruptors for real-time insights on streaming data
A simple yet flexible end-to-end real-time architecture for faster development of digital products
Best practices for controlling cost and complexity of real-time analytics at scale

Speakers

Shruti Bhat leads product management and marketing at Rockset. Prior to Rockset, Shruti led Product Management for Oracle Cloud, with a focus on AI, IoT and Blockchain. Previously, Shruti was VP Marketing at Ravello Systems, where she drove the start-up's rapid growth from pre-launch to hundreds of customers and a successful acquisition. Prior to that, she was responsible for launching VMware's vSAN and has led engineering teams at HP and IBM. Shruti has a bachelor's in computer science engineering and an MBA from UCLA Anderson.

Show Notes

Julie Mills:

Hi, good morning. I'm Julie with Rockset and I'm joined today by Shruti, who is our chief product officer. And she's going to be giving a talk today on emerging architectures for streaming data analytics. I'm just going to cover a couple of housekeeping items before we get started. This talk will be recorded and participants will be muted through the duration of the talk. We are answering questions as we go through this. So feel free to drop those into the chat and we will be answering them throughout. And at the end we'll also be opening up for a live Q&A. I'm going to go ahead and turn it over to Shruti to get started. Thanks Shruti.

Shruti Bhat:

Thank you, Julie. Welcome everybody. We are going to talk about something really interesting today: the modern real-time data stack. I think the industry is really going through a transformation and it's a very exciting time. Just the modern data stack itself is very innovative and gaining a lot of traction and we are seeing a whole different side of it with real-time. So we're going to talk about that in a lot of detail today. Before we jump into that, just a quick introduction, my name is Shruti Bhat. I run product and marketing here at Rockset. Our team came from predominantly Facebook in the early days where they really saw how the first generation digital disrupters were able to move a lot faster with real-time analytics. So you have the picture here of our two founders and they have some interesting stories to tell around a time when Facebook newsfeed used to be batch, your newsfeed was compiled every night and you had a fresh one in the morning, and how things changed once everything went real-time and how engagement shot through the roof and what it took on the backend to actually deliver real-time at that scale.

Shruti Bhat:

I mean, we're talking about 5 billion queries per second globally. So bringing that experience and saying, what does it take to empower the next generation of digital disrupters without really requiring, say, thousands of engineers to build this infrastructure out on the backend? That's really our mission. We are a series B company backed by Greylock and Sequoia. And we have today over 100 customers in production. So a really exciting time here at Rockset. But jumping right into the discussion for today, the modern data stack, I think there are a few very, very interesting pieces out there. If you've not seen these, I highly recommend them. They're coming from thought leaders in the industry. One of my favorite ones here is from our friends over at DBT, written by Christine, on kind of where the data stack has come from, how it's evolved in the last few years and where it's going in the next few years.

Shruti Bhat:

And of course, real-time being a big part of that. The other one here I highly recommend is from the Data Engineering Podcast, this particular episode 203. If you haven't caught this podcast before, it's really worth subscribing to. This one is around the design and benefits of the modern data stack, actually a couple of practitioners coming in and talking about how they've implemented it in different companies and what they've seen in terms of the immediate benefits and the design considerations as they implemented this. My main takeaway from listening to all of this and being part of the conversations with customers who are implementing this has been mainly around cloud and data. What we are seeing is that with all of the complicated data pipelines becoming more and more simplified in the cloud, suddenly it's changed the game. What does that mean? What do we mean by simplifying it? So just to summarize what we are hearing from our customers, as well as from the industry, when we look at the modern data stack, there are four key criteria to keep in mind.

Shruti Bhat:

And this almost becomes the cornerstone of your evaluation when you're embracing this stack. The first one is whether it's cloud native. Now, there are a lot of different options to lift and shift things from on premise to cloud. But when you do that, what you kind of lose is that real ease of deployment that comes from something that's cloud native, that's built for the cloud, and this notion of unlimited scalability which, again, is only possible when something's built ground up for the cloud. So cloud native is a very, very big part of the modern data stack. The second one, I think, is something which is almost like back to the future, right? SQL compatible. In the past, SQL was almost written off a little bit, where SQL is traditional, NoSQL is modern, and that whole wave happened. But now we're finding that, very interestingly, companies are standardizing their stack around SQL. This gives you a few different things. Of course, the standard SQL tooling, all your favorite dashboards, all your existing, whether it's data teams or developers who already speak SQL, they become productive from day one.

Shruti Bhat:

You don't need to go invest in training or invest in new tooling. So SQL compatible is a big part of this evaluation criteria. It helps you to build an end to end stack that is not only cloud native, but also just speaks SQL from start to finish. And we'll talk about how that makes your life and everybody else's life easier. The third one here is slightly non-obvious. I think when you think cloud native, my first reaction is always cloud native means low infra ops, right? You don't have to manage the infrastructure. You don't have to worry about capacity planning. You don't have to think about at what point do you scale compute versus storage and adding more nodes and servers, et cetera. But the very interesting part on top of that is the low data ops. And this is something that, again, we're seeing from customers really putting this as an evaluation criterion right up front, which is how much data operations do you need day one. And by day one, I mean, things like, do you have to denormalize the data? Do you have to flatten your JSON? Do you have to do a lot of schema management? Do you have to do a lot of data prep before you can even load the data the first time into your data stack? And then it's not just day one, it's also day two operations.

Shruti Bhat:

Day two operations meaning, if you, say, build a bunch of pipelines to constantly manage your schema or constantly denormalize it, what ends up happening over time is that when, say, your developers want to iterate fast, or your business ops teams are asking for a new data source, or they're asking for new types of queries they want to experiment with, suddenly everything becomes a two to three week long cycle even to give them a first stab at it. Even for that first round, first iteration, you're taking two to three weeks and that's no longer acceptable. And what the modern data stack really does is it makes it very easy to iterate and adapt to change in this new world where the shape of the data is always changing and the types of queries are always changing. So this is kind of non-obvious, but something to really keep in mind as you evaluate your modern data stack and kind of transform everything that you're doing towards a more streamlined and clean way of doing things.

Shruti Bhat:

The last one, I think, again, is affordability. Certainly the modern data stack is allowing a lot more data applications to come online. Why is this happening? Because now you can actually process data at massive scale at amazing speed, but without kind of the cost associated in the past. And there are two types of costs that we typically measure here. One is the resource efficiency. Of course, you're thinking about compute and storage in the cloud, but also human efficiency. Going back to the Facebook example I talked about before, one of the things we saw was that the first gen digital disruptors, talk about the FANGs, all had these massive data teams, right? They had thousands of data engineers building their data stack on the backend. Today we're seeing the modern data stack should allow you to move at that same pace, but without having thousands of engineers at your disposal, which most companies today just can't afford to do. So these are the four main things that we are seeing from customers.

Shruti Bhat:

But here's where the conversation changes as far as Rockset is concerned: we see that all of this was built for a batch world. In the last few years, a lot of the conversation around the modern data stack has been very batch oriented, but what does this look like as you move, not just from on-prem to cloud, which was, yes, modern, but from batch to real-time? So let's talk about that. I want to start with some of the real-time use cases that most of our customers tend to start with. These are the more common ones that we are seeing in the market. And it's really interesting because we're discovering that real-time is addictive. I was talking to, I won't name the company, but this is your favorite coffee shop that is around every corner. I was talking to them and one of their data team members made a very interesting comment, which is, "Gone are the days when they could ask the people in their coffee shop to go look up a system and see what is the inventory and place your order today."

Shruti Bhat:

Today, I mean, blame everything on millennials, I guess, but you have millennials in the store who basically expect your system to figure out that you're low on inventory, automatically place the order and send them an alert saying that the milk is on its way and will be delivered at 4:00 PM today. That is the power of real-time, where a lot of the backend operations are being automated and then the people who need the information at the right time are getting just the right alert sent to them. So that's one example. A few more examples that we are seeing as very popular: of course, geo tracking. So you know at any point where your food is, you know at any point where your packages are, so no surprises there. What we are seeing is that a lot of this data actually comes from even sensors. Take an example of your, say, UPS delivery or FedEx delivery, right? A lot of the drop boxes are now becoming smart.

Shruti Bhat:

So the minute you drop something off, ideally your application detects that a package was dropped off and finds the nearest truck and reaches out to that driver to say, "Hey, pick this up. This particular package is on your way." So those kinds of real-time applications are joining data. They're joining sensor data, they're joining information about where the trucks are in real-time and then making real-time decisions. Another example is around fitness. Fitness has gone completely digital. We're seeing, for example, companies disrupting healthcare by saying, "Hey, what if we could have a leaderboard for fitness competitions and then you get discounts on your health insurance because of your lifestyle?" This is actually a real customer story: Rumble, who's doing this today. And then the other one, a very obvious one, is around online shopping. How do you do recommendations? I think Amazon and others have set the bar. How do you recommend while the customer is browsing, based on their current browsing, as well as their past history, as well as what's trending right now with other customers? How do you recommend what they should be purchasing and personalize that checkout experience at the time of purchase?

Shruti Bhat:

The very, very interesting thing about all of this is it's all leading up to people saying, "Hey, if I can have this in my consumer life, if I can have this in my daily life, why can't I have this in my business?" And we are seeing today that customer 360, kind of this operational analytics on your customers, feeding all the business data in real-time to the people who need it, in the right place at the right time, is becoming a major, major trend. So we'll talk about how we are seeing people implement all of this in their data stack today. So here are the key elements of the modern data stack that are kind of, I think, the building blocks to implement this in your environment. Typically, we start with, how do you capture your stream? There are two types of streams we see, right? Your event streams, which typically tend to be click stream data, sensor data, oftentimes coming from something like Kafka or Kinesis. And then the other type of stream that we see, which is becoming more and more popular now, is CDC, which is database change data capture streams. Typically, something like Debezium is helping you get CDC streams from your operational database and feed that into your analytic system downstream.

Shruti Bhat:

Once you've captured your event stream, you know how to get this data flowing in real-time. The next step tends to be, how do you transform this data? Again, two different patterns emerging, both, I would say, equally popular today. One is the streaming ETL with technologies like Spark and Flink, which I think have been around for a little bit longer. It's pretty common to say, how do you do stream in, stream out? So you have, say, a Kafka or Kinesis stream coming in. How do you do standing queries on that so that you can transform the data as it's happening, so that stream is coming in, it's getting transformed and then stream is coming out as a transformed data stream? And that's your classic streaming ETL pattern. The other pattern emerging here is around real-time ELT, which with technologies like DBT is becoming more and more common. So what DBT has done, for those of you who are not familiar, is it's allowed this new phenomenon, which is, well, not only do you just load your data as is into your destination, but you automate a series of transformations inside the database. And that database could be, in the batch world, something like Snowflake or Databricks.
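
To make the stream in, stream out pattern concrete, here is a minimal Python sketch. It is not Spark or Flink code; the event fields (`user_id`, `action`, `ts_ms`) and the function name `transform_stream` are assumptions made up for illustration. A generator consumes raw events one at a time and yields transformed events, so the output is itself a stream.

```python
from typing import Dict, Iterable, Iterator


def transform_stream(events: Iterable[Dict]) -> Iterator[Dict]:
    """Consume raw events one at a time and yield transformed events."""
    for event in events:
        # Drop events that lack the key we need (hypothetical schema).
        if "user_id" not in event:
            continue
        yield {
            "user_id": event["user_id"],
            # Normalize the action name to lower case.
            "action": event.get("action", "unknown").lower(),
            # Convert epoch milliseconds to whole seconds.
            "ts_seconds": event.get("ts_ms", 0) // 1000,
        }


# Stand-in for a Kafka or Kinesis consumer: any iterable of dicts works here.
raw_events = [
    {"user_id": 1, "action": "CLICK", "ts_ms": 1700000000123},
    {"action": "VIEW", "ts_ms": 1700000000456},  # no user_id, filtered out
    {"user_id": 2, "action": "View", "ts_ms": 1700000001789},
]

transformed = list(transform_stream(raw_events))
```

In a real pipeline the list would be replaced by a consumer reading from the stream and the results would be written to an output stream rather than collected in memory.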

Shruti Bhat:

And then in the real-time world, something like Rockset, where you have a series of transformations happening inside the database, inside your real-time database, which is automated through DBT. So that's about the transformations in real-time. The other thing that we see very often is as you start building more and more real-time applications, they tend to be smarter in nature. They are apps, not static dashboards typically, and these apps are very smart. How do you build these smart features into your applications? This is where we see more real-time features and ML models being built. The leading tools you're seeing here are things like Databricks and Tecton, which I think was built by the Michelangelo team coming out of Uber. But then, eventually you come into the database layer. What is that real-time analytics database? What is that serving layer? This is where Rockset fits into your stack.

Shruti Bhat:

And finally, Reverse ETL is where you take the data from your database. You're doing your search, aggregations and joins, but you take the data from your database and then serve that back into whichever SaaS application the user is consuming. Think of this as, say, your Salesforce or your HubSpot, or even your Slack, which is where your users live. How do you push the right alerts or the right information back to your users? This is where Reverse ETL is making a lot of inroads. So that's a quick overview of the tools that we're seeing in the real-time data stack. Let's look at the architecture here. This is a pretty typical architecture of how the data flows through the modern real-time data stack. You're seeing data being generated. We talked about it: through sensors, through users doing user activity like click streams, or even through apps and databases like, say, MongoDB or DynamoDB. All of this is getting fed into one of your streams, could be your event streams or your CDC streams, like Apache Kafka or Kinesis.

Shruti Bhat:

From those streams, there's an optional step: if you're building real-time features or ML models, then you have your MLOps and ML pipeline there. If you are not doing any machine learning, it's just an analytics application where you just need to join maybe a stream and another data set. You just go through that directly through your streaming ETL or ELT. From there, it goes into your real-time database, in this case Rockset, where you would do your rollups as the data comes in, do real-time search, aggregations and joins as the data comes in, and then eventually feed that back to the user, either through Reverse ETL like Census, Hightouch, Omnata or through data APIs, which is your real-time search, monitoring or tracking applications. We also do see visualizations here. The more popular visualizations tend to be things like Grafana and Superset, which are more oriented towards real-time and have alerts and time series visualizations built in.

Julie Mills:

Hey Shruti, I had a couple questions for you before you get into Rockset. Is now a good time?

Shruti Bhat:

Yes.

Julie Mills:

Awesome. So I wanted to know, are there options for CDC besides Debezium?

Shruti Bhat:

Oh, interesting question. So Debezium is an open source tool. It's most commonly used with Postgres and some of the more popular SQL databases out there. But what we're seeing is that the NoSQL databases like DynamoDB, MongoDB have already built CDC streams into their platforms. So Dynamo streams actually, from the Rockset perspective, are one of the top CDC streams coming into Rockset. Dynamo is very interesting. DynamoDB is massively scalable, completely serverless, but very limited in its analytics capabilities. So how do you take real-time information from DynamoDB and do things like search, aggregations and joins while the data is getting updated? This is where we see Dynamo streams coming in. Same for MongoDB. So Mongo Atlas has Mongo streams and those CDC streams are also very popular. As and when the data changes, it basically pushes your updates, deletes and inserts as a change data capture stream. Great question.
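
As a rough illustration of what consuming a change data capture stream involves, here is a minimal Python sketch that applies inserts, updates and deletes to an in-memory replica. The change record shape (`op`, `key`, `doc`) is a simplified assumption for this example; it is not the Debezium, DynamoDB Streams or MongoDB change stream format.

```python
from typing import Dict


def apply_change(replica: Dict[str, dict], change: dict) -> None:
    """Apply one change record to the replica in place."""
    op = change["op"]
    key = change["key"]
    if op == "insert":
        replica[key] = dict(change["doc"])
    elif op == "update":
        # Merge only the changed fields into the existing document.
        replica.setdefault(key, {}).update(change["doc"])
    elif op == "delete":
        replica.pop(key, None)
    else:
        raise ValueError(f"unknown op: {op}")


# Hypothetical change stream from an operational orders table.
changes = [
    {"op": "insert", "key": "order-1", "doc": {"status": "placed", "total": 20}},
    {"op": "insert", "key": "order-2", "doc": {"status": "placed", "total": 35}},
    {"op": "update", "key": "order-1", "doc": {"status": "shipped"}},
    {"op": "delete", "key": "order-2"},
]

replica: Dict[str, dict] = {}
for change in changes:
    apply_change(replica, change)
```

Because every change is applied in order, the replica ends up mirroring the source table, which is why the downstream store has to support in-place updates and deletes.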

Julie Mills:

And then one more for you before we keep going. How do these data APIs look, is this GraphQL or is it something else?

Shruti Bhat:

So data APIs, this is a whole topic by itself, I think. What Rockset does on our side, and this is, I think, becoming more and more of an industry standard, we're seeing other database companies also embrace the same pattern, is that we've standardized on SQL across the stack. Great. Now, how do applications consume that SQL? What's the most secure and efficient pattern for applications to consume those SQL queries? It's not the best idea to embed SQL inside your application, for so many reasons, from SQL injection all the way to, how do you manage changes? So instead the standard layer here is to have data APIs. So what we've done on the Rockset side is you write your SQL statement, which is your search or your aggregation or your real-time join. And then you expose that as a REST endpoint, and that is SQL over REST, which becomes a data API.

Shruti Bhat:

And that is what you hit from your application directly. In cases where GraphQL is in the picture, it's very much compatible to have these SQL data APIs and GraphQL in your stack. Because what is GraphQL doing? GraphQL is sort of enforcing a way for these data APIs to communicate with these applications. And it makes it more clean and streamlined. So data APIs is a more broad way of saying it's just SQL over REST. It's a REST endpoint. And yes, GraphQL would absolutely be an option if you wanted to have GraphQL in your stack managing these data APIs and how they communicate with your application.
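
The idea of keeping SQL out of the application and exposing a named, parameterized query instead can be sketched in a few lines of Python. This is only an illustration using the standard library `sqlite3` module as a stand-in database; the `orders` table, the `SAVED_QUERIES` registry and the `call_data_api` function are hypothetical names, and no HTTP layer or Rockset API is shown.

```python
import sqlite3

# Saved queries live on the server side; callers only know the name and parameters.
SAVED_QUERIES = {
    "orders_by_status": (
        "SELECT status, COUNT(*) AS n FROM orders "
        "WHERE customer = :customer GROUP BY status ORDER BY status"
    ),
}


def call_data_api(conn: sqlite3.Connection, name: str, params: dict) -> list:
    """Run a saved query by name, binding parameters instead of concatenating SQL."""
    sql = SAVED_QUERIES[name]  # unknown names raise KeyError
    cursor = conn.execute(sql, params)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("acme", "shipped"), ("acme", "placed"), ("acme", "shipped"), ("globex", "placed")],
)

result = call_data_api(conn, "orders_by_status", {"customer": "acme"})

# A hostile value is treated as plain data, so it matches no customer.
hostile = call_data_api(conn, "orders_by_status", {"customer": "acme' OR '1'='1"})
```

The application never sends SQL text, only a query name and parameter values, which addresses both the injection concern and the question of where query changes are managed.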

Julie Mills:

Awesome. Thank you.

Shruti Bhat:

Great. Hey, great questions. Thank you. Please, keep the questions coming in your chat window and Julie will share them here. So with that, just switching gears a little bit and talking briefly about Rockset, which is the real-time analytics database. This is powered by Converged Indexing, and I just wanted to share a quick visual on what it looks like if you zoom in on that portion of the stack, right. We talked about, say, data streams or CDC streams, or even data coming from some of your batch systems sometimes. What Rockset does is it's the real-time analytics database in the cloud. So it ingests data schemalessly. Again, standardizing on SQL and saying, hey, as the data comes in, you can do real-time data transformations and rollups just using a SQL statement. So you're constantly rolling up your data as it comes in, or you're constantly transforming your data as it comes in, just using a SQL statement. So that's the ingest side of it.
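
To show what rolling up data as it comes in means, here is a small Python sketch. In the talk this is done with a SQL statement at ingest; the sketch below only mimics the effect in plain Python, and the event fields (`ts`, `sensor`, `value`) and the `MinuteRollup` class are assumptions for illustration. Instead of storing every raw event, the store keeps one aggregate per minute and sensor and updates it on each arrival.

```python
from collections import defaultdict


class MinuteRollup:
    """Keep one running aggregate per (minute, sensor) instead of raw events."""

    def __init__(self) -> None:
        self.buckets = defaultdict(lambda: {"count": 0, "total": 0.0})

    def ingest(self, event: dict) -> None:
        # Truncate the epoch-seconds timestamp to the start of its minute.
        minute = event["ts"] - event["ts"] % 60
        bucket = self.buckets[(minute, event["sensor"])]
        bucket["count"] += 1
        bucket["total"] += event["value"]


rollup = MinuteRollup()
for event in [
    {"ts": 120, "sensor": "a", "value": 1.0},
    {"ts": 150, "sensor": "a", "value": 3.0},
    {"ts": 185, "sensor": "a", "value": 5.0},
    {"ts": 130, "sensor": "b", "value": 2.0},
]:
    rollup.ingest(event)
```

Four raw events collapse into three aggregate rows here; at high event rates that reduction is where the storage and compute savings come from.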

Shruti Bhat:

And then the query side of it is this thing that we call the Query Lambda. We just talked about the data APIs. It's basically saying, well, you can just take your SQL query and expose that as a REST endpoint. But what's happening under the hood is we are building this thing called the converged index in real-time. This is a completely new approach. Indexing is not new, of course. Google search, Elasticsearch, indexing has been around. But Converged Indexing is where we say, we don't just do a search index. We also do a columnar as well as a row index so that you can do real-time SQL search, aggregations and joins without having to define schema or anything on this data. So what does this look like? The key takeaway here from a Rockset standpoint is how we fit into the stack as the database layer, as the serving layer. We're basically saying whether it's coming from your event stream or CDC stream, it's getting transformed.
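
The idea of indexing the same data three ways can be illustrated with a toy Python class. This is a conceptual sketch only and says nothing about how Rockset actually implements its converged index; the class name `ToyConvergedIndex` and the flat sample documents are assumptions. Each document is written to a row index (for point lookups), a columnar index (for aggregations) and an inverted index (for search).

```python
from collections import defaultdict


class ToyConvergedIndex:
    """Store every document in a row, a columnar and an inverted (search) index."""

    def __init__(self) -> None:
        self.rows = {}                     # doc_id -> full document
        self.columns = defaultdict(dict)   # field -> {doc_id: value}
        self.inverted = defaultdict(set)   # (field, value) -> {doc_id, ...}

    def add(self, doc_id: int, doc: dict) -> None:
        self.rows[doc_id] = doc
        for field, value in doc.items():
            self.columns[field][doc_id] = value
            self.inverted[(field, value)].add(doc_id)

    def lookup(self, doc_id: int) -> dict:
        # Row index: fetch a whole document by id.
        return self.rows[doc_id]

    def search(self, field: str, value) -> list:
        # Inverted index: find matching documents without scanning.
        return sorted(self.inverted.get((field, value), set()))

    def column_sum(self, field: str):
        # Columnar index: aggregate one field without touching the others.
        return sum(self.columns[field].values())


index = ToyConvergedIndex()
index.add(1, {"city": "SF", "amount": 10})
index.add(2, {"city": "NYC", "amount": 25})
index.add(3, {"city": "SF", "amount": 5})

sf_docs = index.search("city", "SF")
total_amount = index.column_sum("amount")
```

A query planner can then pick whichever structure suits the query: the inverted index for selective filters, the columnar index for aggregations, the row index for fetching full records.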

Shruti Bhat:

Eventually it comes into Rockset as the analytics database. We're saying we enforce low latency at cloud scale. So the new data is queryable, fully indexed and queryable within a second, and you're getting millisecond latency queries on tens to hundreds of terabytes. We talked about the low data and infra ops; this is kind of front and center for our design. The converged index is fully mutable. Very, very important for your real-time data stack. Mutability is something that most batch systems do not have. And this is the reason that most batch systems cannot keep in sync, cannot keep up with your database as the data comes in. And this converged index being fully mutable means if you have a CDC stream, the updates, the deletes are handled in place. So it always stays in sync with the source. It's always one second behind, say, your Mongo or your Dynamo or your Postgres. And that, again, really reduces your data ops, because with this schemaless ingest you don't have to flatten your JSON, you don't have to denormalize your data.

Shruti Bhat:

You don't have to manage schemas. So the data ops component is really minimized. And then because of the cloud native nature, the infra ops is almost none. It basically scales in the cloud as and when your load increases. And finally, one of the biggest things that we've seen holding people back is when we hear the words, "Oh yeah, maybe batch is good enough for my use case." What we're really hearing is, good enough means you're settling. Good enough means you think you can't afford the real-time stack for this. Suddenly with this modern, real-time data stack, with all the things that Rockset has done, we're saying, well, you can get a real-time pipeline at the same, or maybe even a lower cost because you can do things like selectively ingesting fields, rolling data up as it arrives, independently scaling compute and storage and really optimizing for compute. So certainly that settling for good enough is, no, not good enough. A good example is, I think 10 years ago, if you had told me I would be doing two-day shipping for everything, I would say, no way. For most of the things, it's good enough for me to drive over and pick it up. I'm not going to do two-day shipping.

Shruti Bhat:

Fast forward to today, it's become so cheap and so available that, yeah, hey, I have two-day shipping for almost everything I order online. So that's kind of the transformational nature. If things are much more easy and accessible, you can get this for a lot of the use cases that you thought it was out of reach for. And with that, I just wanted to show a quick live demo so that you can visualize what this looks like. This is going to be super quick because there's a free trial that you can try if you wanted to. Let me just start with how you would get into this. Highly encourage you to try this on your own. Again, the beauty of the modern stack is that it is all available for self-service. So go ahead, start your free trial. For everybody who references this webinar, we're happy to give you extra trial credits. So go ahead and start your free trial right now. And we will give you extra trial credits if you mention this webinar. But once you start your trial, this is what you'll see. Very, very simple. You ingest your data, start running queries.

Shruti Bhat:

Nothing to download, install, manage. In Rockset, collections and tables are synonymous. So you start by creating a new collection and simply pointing to your data source. Like I said, you pick your CDC stream if it's coming from Mongo or Dynamo. Or you pick your event stream, if it's coming from Kafka, Kinesis or even in some cases data sitting in S3. Let's use Kinesis as an example. And I'm going to say, let's just use one of the existing integrations that I have here. Now, the only input that you really need to give is your Kinesis stream name and the region. And immediately you can start transforming it. So I'm going to switch over to show you what that would look like. I have a Kinesis stream, which is actually a Twitter stream. And as soon as you give me the Kinesis stream name and where that is available, Rockset has started ingesting that live Twitter feed and presented it to you as a fully typed, fully indexed SQL table. This is actually deeply nested JSON, but it's a fully typed SQL table in real-time.

Shruti Bhat:

We talked about applying a SQL statement to transform your data in real-time as it comes in. So what you typically do is write a quick SQL command. In this case, what I'm doing is I am rolling up the Twitter stream as it comes in, and it'll immediately show you, okay, if you want to apply the SQL statement on ingest, this is what your source data and transformed data would look like. So one SQL statement to do real-time transformations and rollups on your data as it comes in, and that's it. And then you go ahead and create your collection. Actually, I have a collection that I've previously created. So let me just go to that. Right here, which is a really interesting Twitter fire hose. And as you can see, this Twitter stream is now a fully typed SQL table. I can see what fields exist. I can see what's an array, what's an integer, etc. The converged index is completely invisible. It's been built for you underneath. You don't have to define your indexes. You don't have to manage the indexes. It's already been built for you. All you have to do is start writing your SQL queries. So let's run a quick SQL query on this collection. And the SQL query I'm going to run here is I'm taking that Twitter stream, which is being transformed and rolled up in real-time.

Shruti Bhat:

I'm going to unnest it to look for ticker symbols. Ticker symbols, meaning, hey, is anybody tweeting about TSLA or AAPL right now? So you have to unnest that data in real-time to find these ticker symbols in those tweets. And then I'm going to say, let's join that with NASDAQ data that I got from NASDAQ and that's sitting in S3. Very classic use case. In real-time, you want to join some stream with some dataset that you already have. So without doing any data preparation jobs, I literally took the raw Twitter stream and started writing SQL statements on it. And in real-time, I'm getting a live feed that tells me, oh, TSLA happens to be the top tweeted company in the last one day, followed by FB, no surprises here. So this is a live Twitter stream. If you were to go in and tweet TSLA 10 times right now, you'd see this get updated. And we went from raw Twitter data coming from Kinesis to joining that with a completely different format sitting in S3 and coming from NASDAQ, and now building a real-time dashboard of which are the top three ticker symbols.
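
The shape of that demo query (unnest the ticker symbols in each tweet, join them against a reference dataset, count and rank) can be sketched in plain Python. The demo itself uses SQL on Rockset; the tweet structure with a `symbols` list, the `listings` lookup and the `top_tickers` function below are simplified assumptions, and the sample data is made up.

```python
from collections import Counter

# Hypothetical, simplified tweets: each carries a nested list of ticker symbols.
tweets = [
    {"id": 1, "symbols": ["TSLA", "AAPL"]},
    {"id": 2, "symbols": ["TSLA"]},
    {"id": 3, "symbols": []},
    {"id": 4, "symbols": ["TSLA", "FB"]},
    {"id": 5, "symbols": ["FB", "ZZZZ"]},  # ZZZZ is not in the reference data
]

# Stand-in for the reference dataset sitting in S3 (symbol -> company name).
listings = {"TSLA": "Tesla, Inc.", "AAPL": "Apple Inc.", "FB": "Facebook, Inc."}


def top_tickers(tweets: list, listings: dict, limit: int = 3) -> list:
    counts = Counter()
    for tweet in tweets:
        # Unnest: one tweet expands into zero or more (tweet, symbol) rows.
        for symbol in tweet["symbols"]:
            # Inner join: keep only symbols present in the reference data.
            if symbol in listings:
                counts[symbol] += 1
    # Rank by count descending, breaking ties alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(symbol, listings[symbol], n) for symbol, n in ranked[:limit]]


leaders = top_tickers(tweets, listings)
```

In SQL the same steps would typically be an unnest of the array, a join on the symbol and a grouped count with an order and a limit.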

Shruti Bhat:

And once you have this SQL statement, you just save this as a Query Lambda. So there's a save button, save this as a Query Lambda, and that is essentially your data API. So these are some of the Query Lambdas that we've already saved. And if you were to click on this top ticker tweets, you would see you just have a REST endpoint that you can call, or you can call it from your Python or Node.js. So that is the demo in a nutshell. It's very, very straightforward. And this is the beauty of the modern real-time data stack. We're seeing a lot of use cases. I won't go into all of these, but it's very popular in construction. One of our customers basically tracks all the cement mixers. So if you see a cement mixer on the streets there's an 80% chance that it's being tracked by Rockset. You need to know where the trucks are in real-time. Another common use case is gaming. So this is eSports, real money gaming. How do you do fraud detection? How do you catch fraud while it's happening, while people are placing bets? Ritual is eCommerce.

Shruti Bhat:

So really digital disruption around healthcare and vitamins, and they use this kind of real-time data stream and Rockset to personalize the eCommerce checkout experience. And of course, Bosch, I think we all are familiar with it, but you can absolutely see the use case where if there's a defect on the manufacturing line, you want to catch the defect in real-time before thousands of defective tools have been churned out on the manufacturing line. So these are some of the common examples, but we're seeing now certainly these use cases are everywhere. People are demanding this for sales, for marketing. For anything in Salesforce or in HubSpot, you want a real-time view of what's happening with your customers. Why can't you have a simple real-time data stack that gives you updates? So I'll pause here. Are there any other questions, Julie?

Julie Mills:

Yes. I had a couple that came in and feel free to continue to chat them in as I'm going through these. So one of them that came in was from somebody that says, "We're using DBT for Snowflake today. Can you elaborate on how DBT plays into real-time?"

Shruti Bhat:

Great question. DBT definitely started with the rise of the analytics engineer, so it started with batch. So Snowflake, Databricks, where you could batch load your data, run a series of transformations using DBT to automate it, and then have your final data set in your warehouse. So that's the batch world. In the real-time world, it looks somewhat different because you won't be running DBT. You won't do DBT run so many times. You'll probably just do DBT run once. But once you trigger that, your data is being transformed in real-time, and you use DBT really to manage these transformations, to make sure that as things change you have a single place where you can control all of this automation. And a great way to see how this would work is we published a blog about DBT in Rockset and how you can use DBT today to manage the SQL transformations as data comes in, or the SQL queries where once the data is being indexed, you can do a series of SQL queries on top of that.

Julie Mills:

Awesome. Thank you. And then I just have one more for you, which is what are the common use case patterns for Reverse ETL?

Shruti Bhat:

Reverse ETL. That's interesting. So this is really driven by the fact that today people are not consuming data statically in reports. They're consuming data in their SaaS apps where they live. So if you think of a real world use case, one of our customers runs digital classrooms. In real-time, the sales team needs to know what kind of activity is happening across the different schools, across the different classrooms. So how can you take the data, the user activity data from their SaaS application, their online classroom tech, and then feed that back into another SaaS application, which is Salesforce? And of course, along the way you want to, say, anonymize some of it, you want to aggregate some of it, because maybe your sales team doesn't need to know exactly which student is doing what, only the teachers need to know all that.

Shruti Bhat:

But your sales team needs to know at an aggregate level what sort of activity is happening in the classroom or what's happening in the school or what's happening in that district. So how do you feed that back into Salesforce? And that's a classic use case for Reverse ETL where you take the real-time data, create these joins and aggregations to make sense of that data in real-time and constantly feed that back into something like Salesforce or Slack. So your end users can consume it where they live.
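
Editor's sketch: the anonymize-and-aggregate step described above can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions: the event fields (`student_id`, `school`, `district`, `action`) and the sample data are hypothetical, and the final push into Salesforce or Slack is left out because it depends on the target system's API.

```python
from collections import defaultdict

# Hypothetical raw activity events from the classroom app (illustrative fields).
events = [
    {"student_id": "s1", "school": "Lincoln", "district": "North", "action": "submit"},
    {"student_id": "s2", "school": "Lincoln", "district": "North", "action": "login"},
    {"student_id": "s1", "school": "Lincoln", "district": "North", "action": "login"},
    {"student_id": "s3", "school": "Adams", "district": "South", "action": "login"},
]


def aggregate_for_sales(events):
    """Roll events up to one row per school and drop student identifiers."""
    event_counts = defaultdict(int)
    students = defaultdict(set)
    districts = {}
    for event in events:
        school = event["school"]
        event_counts[school] += 1
        students[school].add(event["student_id"])
        districts[school] = event["district"]
    # Only aggregate values leave this function; no student_id is emitted.
    return [
        {
            "school": school,
            "district": districts[school],
            "event_count": event_counts[school],
            "active_students": len(students[school]),
        }
        for school in sorted(event_counts)
    ]


rows = aggregate_for_sales(events)
for row in rows:
    print(row)
```

In a real pipeline this roll-up would more likely be expressed as a SQL aggregation over the streaming data, with the resulting rows synced to the SaaS tool on a continuous basis.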

Julie Mills:

Great. Thanks Shruti. Thank you everyone for joining today. We really appreciate it. If you have any questions on Rockset, please feel free to reach out to us. We are happy to answer them. And you can also, as Shruti mentioned, start a free trial. And we can give you $300 in free credits to get started with Rockset. Just go to rockset.com. Any last words, Shruti, before we say goodbye for today?

Shruti Bhat:

Thank you so much, Julie, for helping moderate this. The only thing I would say is we really believe that your team should only be bottlenecked by their own creativity and never by their data infra. So once you have the modern real-time data stack implemented, what you really see is a lot more innovation, a lot more flexibility for your teams to iterate and move fast. Now, people think that the Facebook Like button was built in a day. That is true. But the reason it was built in a day was they could do hundreds of growth experiments in a week because they had the kind of modern real-time data stack that we're talking about today. But of course back then it was implemented as a one-off. Today, it's something that's available to everybody. So the minute you have something like this, what you really find is a lot more experimentation, a lot more innovation and a lot more dev velocity with your teams moving faster, just because now they can try new things which before they felt they were bottlenecked by their data stack. So I'm hoping that you can unleash this in your organizations. Feel free to reach out to us. We'd love to have a deeper dive if you have more questions.

Julie Mills:

Awesome. Thanks everyone for the time today. Take care.

Shruti Bhat:

Thank you.
