The Open Source Path to Security and Privacy: Divvi Up and Let’s Encrypt

In this episode of the Open at Intel podcast, host Katherine Druckman spoke with Sarah Gran and Brandon Pitman from the Internet Security Research Group (ISRG) about their projects, Divvi Up and Let's Encrypt. They discuss the creation and impact of Divvi Up, a privacy-preserving metrics aggregation service, and its role in protecting individual data while providing valuable insights to organizations. They share the journey from collaborating with Google and Apple on COVID-19 exposure notifications to enhancing privacy for Firefox users. The conversation also explores the importance of TLS certificates provided by Let's Encrypt and the challenges and advancements in the realm of online privacy. 

“We want anyone who is running a website that touches the public internet to have a TLS certificate and to know that that is a pretty easy, seamless process today.”

— Sarah Gran, vice president of the Brand and Donor Development team at Internet Security Research Group (ISRG)

 

Katherine Druckman: Thank you both for joining me. This is kind of a busy event. We're at All Things Open and nobody has time for anything. I really appreciate you taking a little time out of your schedule to talk to me. If you wouldn't mind, just introduce yourselves really quickly. 

Sarah Gran: Sure. My name is Sarah Gran and I run the brand and donor development team at ISRG, Internet Security Research Group, which is the 501(c)(3) nonprofit behind Divvi Up, our privacy-preserving metrics service. But it's also the nonprofit behind Let's Encrypt, the certificate authority. 

Katherine Druckman: I’m a big fan. 

Sarah Gran: Thank you. I'm wearing my Let's Encrypt pin today to promote our certificates. Let's Encrypt today is providing TLS certificates to 480 million websites, and we do that for free in an automated fashion. I have been at ISRG since 2016, so I've seen a lot of change, and I'm excited to talk about our second project, Divvi Up, today. 

Brandon Pitman: And I'm Brandon Pitman. I'm a software engineer and tech lead on the Divvi Up project. I've been with the ISRG since about 2021, which is about a year after the Divvi Up project started, and I'd love to talk about it with you today. 

Katherine Druckman: Let's just start there. What is Divvi Up? I know about Let's Encrypt. I definitely want to talk about that because, again, I'm a big fan, and anyone who doesn't know about it needs to know about it. 

Overview of Divvi Up

Brandon Pitman: Divvi Up is a privacy-preserving metrics aggregation system, and the purpose is to provide aggregate data to organizations that want it without revealing the individual measurements that are being uploaded by clients. This is important because many organizations across the tech industry, for either operational or business intelligence reasons, need aggregate data about their users. They need to know what their users are doing with their products. They need to know how their products are operating. On the other hand, the tech industry has a large problem with leaking data. Organizations are hacked all the time. 

Katherine Druckman: How many of those emails have we gotten? 

Sarah Gran: Too many. 

Katherine Druckman: You get free LifeLock for a year. Thanks.  

Privacy Concerns and Data Security

Brandon Pitman: I've forgotten how many emails I've gotten from Have I Been Pwned. The purpose of Divvi Up is to provide a technically enforced guarantee that the data being uploaded from client applications, on your web browser or on your mobile device, can't be leaked, because no entity in the system that's aggregating the data ever sees the measurements in the clear. 
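
For readers who want to see the shape of that guarantee, here is a minimal, hypothetical sketch of the underlying idea: each client splits its measurement into additive shares and sends one share to each of two non-colluding aggregators, so neither server ever holds the value in the clear. The field size, function names, and values here are invented for illustration; this is not the actual DAP/Prio code Divvi Up runs.

```python
# Illustrative sketch only: additive secret sharing across two aggregators.
# Neither aggregator can recover an individual measurement from its share.
import secrets

MODULUS = 2**61 - 1  # arbitrary large prime; the real protocols define their own field


def split_into_shares(measurement: int) -> tuple[int, int]:
    """Split a measurement into two shares that individually look random."""
    share_a = secrets.randbelow(MODULUS)
    share_b = (measurement - share_a) % MODULUS
    return share_a, share_b


def partial_sum(shares: list[int]) -> int:
    """Each aggregator sums only the shares it received."""
    return sum(shares) % MODULUS


# Three clients each report whether they hit a crash today (0 or 1).
measurements = [1, 0, 1]
shares = [split_into_shares(m) for m in measurements]

sum_a = partial_sum([a for a, _ in shares])  # aggregator A's view
sum_b = partial_sum([b for _, b in shares])  # aggregator B's view

# Only the combined result reveals the aggregate, never any single value.
print((sum_a + sum_b) % MODULUS)  # 2
```

Only by combining their partial sums do the two aggregators learn the total; as long as they do not collude, no single party ever sees a raw measurement.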

Katherine Druckman: And so, as an end user, I would imagine this is invisible to me, but what is the incentive for somebody to implement privacy-preserving metrics, given the prevalence of the kind of tracking that we see in the world that people frankly tolerate? So what type of business is going to care enough to take that measure? 

Sarah Gran: That's part of the reason that we started Divvi Up as a public benefit organization. We are here to make the internet more secure and privacy-respecting for those folks who really do not have a lot of knowledge or power or control over whether their data is collected. And we have all been ground down by the expectation of just turning over the data related to our behaviors. So it matters to have a public benefit entity come in and say, "There is a way for you to have the insights that you need as an app developer without compromising the privacy of the people you're trying to make a cool app for." The way this project started was, back in 2020, we were sitting at home doing our Let's Encrypt stuff, disinfecting our groceries before they came into the house, like everyone else. 

Katherine Druckman: I remember that. Yeah, yeah, yeah. Clorox wipes everywhere. 

Sarah Gran: Yep. And we were approached by Google and Apple about running an early version of this service. They had decided to partner to develop an app for COVID-19 exposure notifications, and this was a pretty unusual moment to have these two companies come together and collaborate. They recognized that they could be doing something great for the world by building an app that helps people know if they have been exposed to someone with COVID-19. But this was also the first time a lot of people were asked to put, or considered putting, an app on their phone that collected medically sensitive data. 

And that data was ultimately going to public health authorities. So in the state of Minnesota, where I live, the state was running the COVID-19 exposure app, and they wanted to know what the epidemiological change was in terms of people being exposed or getting COVID. And that's an instance where you really just need to know where the ship is headed. You don't need to know, on an individual basis, how many exposure notifications you got or I got, or whether Brandon got zero. You just need to know, in general, whether there is more or less COVID. And that was a really tangible way for us to see the potential benefit of something like this. It made a big difference to the privacy of people using the app to have a more secure and privacy-respecting experience while also contributing to a greater good and helping to stop the spread of COVID-19. 

Real-World Applications and Examples

Katherine Druckman: Let's talk about the data. Let's talk about the current landscape, the unfortunate reality that we are in, right, where so much data is collected. When you think about something like health data, I think most people will agree that it is very sensitive and they don't want it shared. A lot of people are like, "Oh, I like personalized ads."  

But then when you start thinking about politics, health, finance, certain more sensitive areas, I think people become more concerned. And the reality today is that so much of this data can be de-anonymized, and there are so many unforeseen consequences of all the data that's collected. I can think of several examples. I always talk about the Strava app. You know the running app, Strava? 

Sarah Gran: Yes. 

Katherine Druckman: Are you familiar with what happened when they released all of their running path data, their user data on where people were running? You know that story? 

Brandon Pitman: Oh, yes. 

Sarah Gran: Please tell. 

Katherine Druckman: Well, they released all the data innocently, thinking, "Oh, this would be cool because people can discover new running routes and jog around a park they didn't know about." They released this map of data, and then suddenly there's this hot spot with a clear running track out in the middle of the desert. Why would there be a running track right in the middle of the desert? Oh, because it's underground and it's a secret military facility.  

There was something in the New York Times a few years ago showing how easy it is to de-anonymize data when it's correlated with cell phone location data. From there you can track any staff member of, say, the White House, or somebody else who doesn't want to be tracked. So I wondered if we could talk a little bit more about those harms. I don't think people understand where the harm is in all of this data being collected. 

Sarah Gran: Yeah, I think our partnership with Mozilla is a good example of being really thoughtful about data that could seem innocuous on the surface but actually have a pretty big impact in terms of identifying an individual person. Mozilla was one of our early partners in experimenting with Divvi Up and privacy-preserving metrics for their Firefox browser. They wanted to know how things were breaking in Firefox on different pages. For example, websites update their code all the time, and sometimes that triggers a bug in Firefox. They really wanted to understand the sources of those bugs, but the way to do that would be to know the exact websites that individual users were visiting. That's obviously problematic: if you mentally scroll back through the last 20 pages you've looked at yourself, they provide a pretty clear picture of you as an individual. 

Katherine Druckman: Like a fingerprint? 

Sarah Gran: Yes. And so, that was a real rock and a hard place for Firefox, and Divvi Up enabled them to understand the most visited pages without knowing exactly which users were going to those pages. 

Katherine Druckman: From an engineering perspective, talk to us a little bit more about that. 

Technical Details and Protocols

Brandon Pitman: Sure. We've looked at a number of approaches for how to generate this data. The aggregation system that we use strongly decorrelates each individual page that's visited, while also only producing the most commonly visited domains. So if I visited my personal website, for example, unless I visited it many millions of times, it is extremely unlikely that my website would be represented, even if it triggered a crash in Firefox. On the other hand, if I visit the New York Times, many millions of people are presumably visiting that website, and so it will show up in the most common crashes. So our approach is twofold: not only do we decorrelate the individual measurement from the user that generated it, but we also only represent the most common domains that appear in the data. 

Users can’t be tracked based on their browsing patterns. There’s no way to link “these 20 websites” back to a specific person, like John Smith in a particular town. But we also only represent the most common domains because, in the worst case, a single domain could strongly correlate to one person or a small group of people. 

Katherine Druckman: Right. 

Brandon Pitman: If it's a domain that indicates, for example, some small organization, a personal website, something like that. 
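
As a rough illustration of that second property, the sketch below filters an already-aggregated set of domain counts so that only domains above a release threshold ever appear in the output. The threshold, domain names, and counts are invented for the example; the real system computes these totals from secret-shared reports rather than a plain counter.

```python
# Illustrative sketch only: release only domains whose aggregate count
# clears a threshold, so rarely visited sites never appear in the output.
from collections import Counter

RELEASE_THRESHOLD = 1000  # hypothetical minimum count before a domain is reported


def most_common_domains(aggregated_counts: Counter, threshold: int = RELEASE_THRESHOLD) -> dict:
    """Keep only domains whose aggregate count meets the threshold."""
    return {domain: count for domain, count in aggregated_counts.items() if count >= threshold}


# Fake aggregate for illustration; in practice these totals come from the protocol.
aggregate = Counter({"nytimes.com": 2_500_000, "small-personal-site.example": 3})
print(most_common_domains(aggregate))  # only nytimes.com survives
```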

Katherine Druckman: Interesting. How does your work balance between the two projects: Let's Encrypt versus Divvi Up? 

Brandon Pitman: There are two separate engineering teams. We have some cross-pollination in terms of ideas, but for the most part the engineering work is split between the two teams. We do have a lot of cross-pollination on the non-engineering side: the communications, finance, and fundraising teams share the workload between Let's Encrypt and Divvi Up. 

Sarah Gran: One of the cool areas of cross-pollination between the teams is the work that we have done in the IETF. Let's Encrypt went through the standardization process for the ACME protocol, the protocol that underlies Let's Encrypt, and we spent years working on that in the IETF to get it to a place where it was finalized. And the Divvi Up team has spent four years now, I think, working- 

Brandon Pitman: About three years, yeah. 

Sarah Gran: Three years, working on some protocols in the IETF. And it's been fun to watch the engineers go through the process of asking, "How long is it going to take for us to move from a Birds of a Feather meeting into an actual working group?" or "How can we come up with a clever acronym for the working group name?" I think that's been a good source of informal institutional knowledge sharing amongst the teams. 

Brandon Pitman: Working at the IETF, even just understanding the process is very valuable for removing friction and keeping things moving, especially in a standards body where the work can move rather slowly. 

Katherine Druckman: Let's talk a little bit more about incentivizing people to use both of them, for that matter, Divvi Up and Let's Encrypt. What do you want people to know about why it's so important? 

Sarah Gran: When it comes to Let's Encrypt, we want anyone who is running a website that touches the public internet to have a TLS certificate and to know that that is a pretty easy, seamless process today. When we started Let's Encrypt, encrypted page loads were at about 39%. So about 39% of the time, you would visit a page that had a TLS certificate. Today, that number is over 83% globally, and it's higher in the US. We're in a much better place almost 10 years after Let's Encrypt started, but there are still websites out there that need a TLS certificate. So help those folks get across the line. 

When it comes to Divvi Up, the people we're really talking to are app developers. Whether you have an app that you're putting in the iPhone or Android app store, or you're building a browser, we want you to really think about the telemetry that you're collecting and assess how good you would feel if that telemetry made it out into the public. And know that there's a lot of work being done to make telemetry and metrics collection easier and better. 

Katherine Druckman: I would guess that it removes a little bit of liability, right? If I'm an app developer, I'm going to feel better about not being in a scenario where I lose the trust of my users because we've accidentally done something wrong and leaked data. 

Sarah Gran: Yes. 

Brandon Pitman: Yeah. One of the primary motivations for Divvi Up is that we can provide a technically enforced guarantee that the data that's being collected simply can't be leaked because it's never held. And we're very hopeful that users will see that and say, "Oh. Now I'm more comfortable sharing something that I consider sensitive." 

Katherine Druckman: Yeah, I certainly would. Yeah, yeah, yeah. 

Sarah Gran: One of our early subscribers is a human rights defender nonprofit called Horizontal, and they are in a situation where the folks who are using their apps are recording human rights violations. 

Katherine Druckman: Quite sensitive, yeah. 

Sarah Gran: Quite sensitive. And not only is the data itself sensitive, but the originating source, whoever snapped that picture or recorded the file or the audio, could literally have their life in danger if the data is exposed. So Horizontal was very motivated to try to better understand how their applications were being used and how to make them better, without taking on that existential risk. I think that's an extreme case of sensitive data, but it does help you start thinking about all of the data being collected around us and how much of it feels kind of icky to have other people know about. 

Katherine Druckman: I have said many times that having access to the data on someone's personal mobile device is the next best thing to tapping directly into their brain. You know everything they're thinking about, you know everywhere they've been, you see all of their photos. It's a treasure trove of sensitive information.  

I appreciate the work that you do, but I also understand the uphill battle in raising the awareness of how important it is. Do you feel like it's getting easier to do that? Are people more aware of privacy risks in the technology they use? 

Brandon Pitman: I think there's a growing awareness. As you mentioned earlier, people are realizing: "Oh, my data is going to leak. I've received so much identity theft monitoring. I have 10 years of identity theft monitoring." But people have come to accept it. And I think what we need to push for is to make a better world where people don't accept it. Where people think that, "Oh, an organization shouldn't leak my data. It's not a given that if I enter my data into this form, it will eventually be in the hands of hackers or criminals or whoever." And tools like Divvi Up, I think, are an important step along the way because until recently, there simply were no tools available that could provide guarantees like that. 

Sarah Gran: One of the things that I think is interesting about where we are as an industry in terms of advancing privacy is a growing understanding that it is really about layering on privacy protections. In the last year, we have started using Divvi Up with differential privacy. When differential privacy hit the scene a few years ago, there was a lot of initial excitement about it, and then a little bit of skepticism, because it isn't a silver bullet on its own. But when you layer differential privacy with Divvi Up, the result is even better. We've also started using Divvi Up with Oblivious HTTP. Oblivious HTTP isn't a silver bullet either, but when you put them together, they're even better. 
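
As a loose sketch of that layering idea, the snippet below adds Laplace noise to an aggregate count before it is released, which is the basic move behind differential privacy. The epsilon and sensitivity values are placeholders chosen for the example, not Divvi Up's actual configuration or mechanism.

```python
# Illustrative sketch only: layer differential privacy on an aggregate count
# by adding calibrated Laplace noise before the number is released.
import random

EPSILON = 1.0      # hypothetical privacy budget
SENSITIVITY = 1.0  # one user changes a count by at most 1


def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)


def dp_release(true_count: int) -> float:
    """Aggregation hides individual values; the noise hides whether any one person contributed."""
    return true_count + laplace_noise(SENSITIVITY / EPSILON)


print(round(dp_release(2_500_000)))  # roughly 2.5 million, give or take a few
```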

Katherine Druckman: Right. Layers of protection. 

Sarah Gran: That makes me feel optimistic about the future: we can keep building on top of the learning and the technological developments that have happened and continue to make things better. 

Katherine Druckman: Yeah. Well, this is fantastic. Thank you both. Is there anything that you really wanted to talk about that we haven't gotten to yet? 

Brandon Pitman: I think it's interesting that we're working in the open source sphere. 

Katherine Druckman: Oh, yes. 

Open Source and Community Involvement

Brandon Pitman: All of this work that we've talked about is open source. We have an open source implementation of the Distributed Aggregation Protocol, which is the technology that underlies Divvi Up, so anyone can pick it up and use it. In fact, there's a startup that has forked our aggregator and is building on top of it. It's been very interesting to work in the open source arena and create tools that anyone can use to secure their data. 

Sarah Gran: Yeah, and if you're out there listening to this podcast and you're curious, dig around in our GitHub repos or go to our website and check it out. We would love to get feedback or input. 

Katherine Druckman: Awesome. Well, thank you both so much. I really appreciate it. I've learned some things and I hope everyone else has too. 

Sarah Gran: Thank you. 

Brandon Pitman: Thank you. 

Katherine Druckman: You've been listening to Open at Intel. Be sure to check out more about Intel’s work in the open source community at Open.Intel, on X, or on LinkedIn. We hope you join us again next time to geek out about open source.  

About the Guests

Sarah Gran, Vice President, Internet Security Research Group (ISRG)

Sarah Gran is the vice president of the Brand and Donor Development team at Internet Security Research Group (ISRG), the nonprofit entity behind Let's Encrypt, the world's largest certificate authority. Sarah joined ISRG in early 2016, shortly after the Let's Encrypt launch, and has helped it become a household name in software development. Sarah has also helped to shape ISRG's latest projects: Prossimo, which is focused on bringing memory-safe code to security-sensitive software, and Divvi Up, a privacy-respecting metrics service. Sarah is an independent member of the Tor Project's Board of Directors. Previously, Sarah worked as a vice president at Edelman SF and Deutsch NY in brand and communications strategy groups. 

 

Brandon Pitman, Senior Software Engineer, Divvi Up 

Brandon Pitman is a senior software engineer for Divvi Up and has a master's in computer science from Georgia Tech. Prior to ISRG, they worked at Google on a variety of security, privacy, and green energy projects. Brandon came to ISRG to be a part of improving the privacy stance of the internet as a whole. 

About the Host

Katherine Druckman, Open Source Security Evangelist, Intel  

Katherine Druckman, an Intel open source security evangelist, hosts the podcasts Open at Intel, Reality 2.0, and FLOSS Weekly. A security and privacy advocate, software engineer, and former digital director of Linux Journal, she's a long-time champion of open source and open standards. She is a software engineer and content creator with over a decade of experience in engineering, content strategy, product management, user experience, and technology evangelism. Find her on LinkedIn.