Microsoft Community Insights
Welcome to the Microsoft Community Insights Podcast, where we explore the world of Microsoft technologies and interview experts in the field who share insights, stories, and experiences in the cloud.
If you would like to watch the video version, you can watch it on YouTube below:
https://youtube.com/playlist?list=PLHohm6w4Gzi6KH8FqhIaUN-dbqAPT2wCX&si=BFaJa4LuAsPa2bfH
Hope you enjoy it
Episode 19 - Architecture for High Availability and Low Latency with Florian Lenz
Learn the secrets to building an ultra-responsive e-commerce platform with our special guest, Florian Lenz, a freelance solution architect with a wealth of experience in Azure Cloud projects. Together, we unravel the intricacies of Azure's high availability features, such as geo-redundancy and zone redundancy, that ensure service stability across data centers. Florian shares invaluable insights into leveraging Azure Front Door to enhance website performance through effective content delivery networks, providing practical guidance on choosing the right resources for load redundancy.
Speaker 1:Hello, good morning, good afternoon, wherever you are. Welcome to the Microsoft Community Insights Podcast, where we share insights from community experts to stay up to date in Microsoft. I'm Nicholas and I'll be your host today. In this episode, we will dive into Azure cloud architecture for high availability and low latency. That's a long one. Before we get started, I want to remind you to follow us on social media so you never miss an episode, and to help us reach more amazing people like yourselves. Today we have a special guest called Florian Lenz. Can you please introduce yourself?
Speaker 2:Sure, thanks. Yeah, I'm Florian. I work as a solution architect and freelance consultant. I help my clients make their cloud architecture better and more stable, and that's how this topic of cloud architecture for high availability and low latency came about. And yeah, today I want to talk with you about this topic, and let's see how it goes.
Speaker 1:Okay, so as a freelancer, what kind of projects have you been involved with for your clients?
Speaker 2:Well, it's various projects, so it depends highly on the customer. But in the end, as I said, I'm working as an Azure consultant, so it's basically always related to Azure. One project, for example, is currently creating an Azure landing zone for a customer. Or rather, they already created this landing zone concept, and now my part is to bring some subscriptions into the landing zone and help them architect the whole subscription and the resources. Other projects are less architecting and more programming. I started as a programmer many years ago, and now I'm doing more architecture but still programming, and in such projects I work as a programmer as well and just do programming stuff.
Speaker 1:Yeah. Is it just yourself in that project? Do you have any colleagues, if it's a bigger project?
Speaker 2:It depends. I was on one project where it was only myself. It was quite okay, but in the end it's a little bit lonely when you work all day long only by yourself. Maybe there's a project manager you give some information to, and that's it. Most of the time I'm working in projects with multiple people, but honestly, I'm an external. I don't work in the company, but I work with people from the company and help these people get the best out of it.
Speaker 1:Yeah, it's similar to a contractor, really. You're not officially employed, but you're employed by them, the client. So in today's theme we're discussing high availability and low latency. Can you give us some use cases from projects you've worked on that have high availability?
Speaker 2:Yeah, I mean, basically, at least in Azure, there are multiple concepts of high availability. There is, for sure, the high availability everyone knows: geo-redundancy, globally high availability. So basically, before all the cloud stuff, when you have an on-premises server and you run it, for example, you in the UK or me in Germany, then for sure, when this server is not running anymore, we have a problem, because then our solution is not working. With geo-redundancy we can just add one server to our server farm, for example in the USA, and then we have some kind of high availability. So that means when, for example, my server in Germany goes down, there's still a replica in the USA. Everything's working fine, more or less, but then the traffic goes to the USA instead of Germany, and that's it.
Speaker 2:But in Azure we have not only this geo-redundancy, we have zone redundancy and local redundancy as well. This means, for example, that when you have a storage account, you can choose between local redundancy, zone redundancy, and geo-redundancy. I won't talk about the read-access variants here, but these are the ones we have. Geo-redundancy, as I said, is globally replicated. Then we have zone redundancy, which means when I select, for example, West Europe (I guess West Europe, the region, is in the Netherlands, the Amsterdam region), then in this region there are multiple data centers from Azure and the resource is replicated across these centers. So when one center, for example, has an outage or something else, there are two other centers which are still running, hopefully, and then it's regionally redundant. And then there's local redundancy, which means there is only one data center, and inside this data center I have multiple instances.
Speaker 2:I guess it's always three. Basically, each one is on its own server rack, and when one server rack goes down, there are two other ones. But when the whole data center goes down because of an outage, for example, then with local redundancy I don't have any running instances.
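As a rough sketch of the three redundancy levels described above, the following models which kind of failure each level is expected to survive. The names `Redundancy`, `Failure`, and `survives` are made up for illustration and are not an Azure API; the mapping only restates the explanation from the conversation.

```python
from enum import Enum


class Failure(Enum):
    RACK = 1         # one server rack inside a data center
    DATA_CENTER = 2  # a whole data center in the region
    REGION = 3       # every data center in the region


class Redundancy(Enum):
    LOCAL = "local"  # copies inside a single data center
    ZONE = "zone"    # copies across data centers in one region
    GEO = "geo"      # copies replicated to another region


# Largest failure each option is expected to ride out (illustrative only).
_TOLERATES = {
    Redundancy.LOCAL: Failure.RACK,
    Redundancy.ZONE: Failure.DATA_CENTER,
    Redundancy.GEO: Failure.REGION,
}


def survives(redundancy: Redundancy, failure: Failure) -> bool:
    """Return True if a copy is still running after the given failure."""
    return failure.value <= _TOLERATES[redundancy].value


if __name__ == "__main__":
    for level in Redundancy:
        print(level.value, {f.name: survives(level, f) for f in Failure})
```

With local redundancy, for example, `survives(Redundancy.LOCAL, Failure.DATA_CENTER)` is false, which matches the point that a full data center outage leaves no running instance.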
Speaker 1:Are there any particular resources that you would recommend, like for load redundancy? Because I know you can use App Gateway, Traffic Manager, or scaling, if you have that.
Speaker 2:Yeah, I mean, this depends on your needs. There is a quite good description on Microsoft Learn, but basically, when you have a website, for example, for low latency you can use Azure Front Door. It's a content delivery network, and by using it you have all the advantages of a content delivery network. It's quite fast, and when you work with Azure, it's quite easy to use Azure Front Door.
Speaker 1:Okay. So which one do you use most often to replicate? If it were, like, a VM architecture, which one would you recommend for high availability?
Speaker 2:I mean, in the end there are four. It's Azure Front Door, Azure App Gateway, API Gateway, I don't know. There are four services you can use, and basically it highly depends on your requirements. Most of the time I'm using Azure Front Door. On the one hand, that's because it's a content delivery network, and for websites, which is most of the time what I develop or what is in my infrastructure, this is more or less the best solution. Furthermore, in Azure Front Door you have the web application firewall, so you have some basic security, for example checking whether there's a SQL injection in your requests, and this comes basically out of the box with Azure Front Door. So when I have a web application and it's global redundancy, most of the time I'm using Azure Front Door.
Speaker 1:Okay, so that's it. I'm curious to know, since you freelance, do you use infrastructure as code, like Bicep or Terraform?
Speaker 2:Yeah, sure.
Speaker 1:For your architecture?
Speaker 2:Yeah, I was in a project, and I really liked that there was a project manager who said all the time: we want to automate everything and we don't want to do anything manually. Because otherwise you do it, for example, in the development environment, and then after some time you don't know what you did and why production is not working. So we started with infrastructure as code, and for me most of the time it's either ARM templates or Bicep. It depends a little on what I'm doing. But Terraform, honestly, I don't really work with. For me there's some kind of abstraction I do not really like, because in the end I'm an Azure architect, I work in Azure. I don't want to go to Google Cloud Platform or AWS, for example. So for me the advantages of Terraform are not that big and the disadvantages are bigger, so I do not use it.
Speaker 1:Yeah, otherwise you could go with Bicep, because it's fully Microsoft. So, I know monitoring is a key aspect of high availability. What other metrics do you use to monitor high availability and low latency?
Speaker 2:Basically, here as well, it highly depends. But, for example, when you want to have a highly available system, you always have the problem that there's a requirement that no downtime is allowed, I would say. So you have an SLA, for example, where you guarantee that your software is working 99.99% of the time, and then you want to have high availability in your system. When you want this, you need to have a look at, for example, the Azure resources and at the SLAs from Microsoft as well. So, for example, Cosmos DB, I think it's 99.999 percent. Uh, no, it's four nines, so 99.99%. And when you want five nines, for example, you'll need to go to a more highly available solution or use a different SKU. So this is one metric, for example: when you have some kind of service level agreement, you need to check the agreements from Microsoft or for the other resources you are using.
Speaker 1:Okay, yeah. I myself just use the built-in Azure Monitor metrics to monitor the resource, whether it's Traffic Manager or App Gateway, and just create some alerts and things. I think you mentioned before that you've got some slides.
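To make the difference between four and five nines concrete, here is a small sketch that converts an SLA percentage into allowed downtime per year and combines the SLAs of services that all have to be up at once. The function names are illustrative, and the composite calculation assumes the services sit in series and fail independently, which is a simplification.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def allowed_downtime_minutes(sla_percent: float) -> float:
    """Minutes per year a service may be down and still meet the SLA."""
    return MINUTES_PER_YEAR * (1 - sla_percent / 100)


def composite_sla(sla_percents: list[float]) -> float:
    """SLA of a chain where every service must be up (availabilities multiply)."""
    return prod(p / 100 for p in sla_percents) * 100


if __name__ == "__main__":
    print(f"99.99%  -> {allowed_downtime_minutes(99.99):.2f} min/year")
    print(f"99.999% -> {allowed_downtime_minutes(99.999):.2f} min/year")
    # Two four-nines services in a row end up slightly below four nines.
    print(f"chain   -> {composite_sla([99.99, 99.99]):.4f}%")
```

Four nines allows roughly 52.6 minutes of downtime per year and five nines roughly 5.3, so promising 99.99% on top of a dependency that itself only offers 99.99% leaves no margin.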
Speaker 2:Yeah, I have some slides.
Speaker 1:Yeah, we can show it if you want. Yeah, because we got quite a few viewers.
Speaker 2:Yeah give me a second.
Speaker 1:I see you got cloud light.
Speaker 2:Yeah, there's a lightning bolt inside the cloud. Honestly, it should be some kind of Azure serverless, because it's probably like a function.
Speaker 1:Yeah, it's like a function.
Speaker 2:Yeah, exactly this is the idea.
Speaker 1:Function in the cloud.
Speaker 2:So, present.
Speaker 1:Yeah, if you share your screen, I can just put it up on your screen here.
Speaker 2:Yeah, I share my screen.
Speaker 1:Yeah.
Speaker 2:Perfect. Yeah, so basically, one interesting fact I found on the internet is from Google. Google found out some years ago that when your loading time increases, your bounce rate increases as well. From zero to one second, for example, an increase in the bounce rate is not really measurable. For sure, when we have, for example, 900 milliseconds of loading time, it's not as good as when you have 300, but up to one second it's more or less okay. From one to three seconds it's already increasing, and it increases sharply when you have a loading time of more than three seconds.
Speaker 1:Sorry, what's bounce rate?
Speaker 2:Okay, bounce rate is the number of people who leave the site.
Speaker 2:So you go to the site. It's loading one second, two seconds. Okay, it takes too many seconds, I close it, I don't care, it takes too long. So this is the bounce rate. And, for example, when in general you have three people who leave the page because it's too slow, then when the load time is one to three seconds, the bounce rate increases by 32 percent. That means it's not three people, it's four. And when you have, for example, 300,000 people leaving your page, then it's about 100,000 people more, only because your page is too slow.
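The bounce-rate arithmetic can be written out as a short sketch. The 32 percent increase for load times between one and three seconds is the figure quoted in the conversation; the function name and the baseline of 300,000 visitors are hypothetical and only there to make the calculation runnable.

```python
def bounces_after_slowdown(baseline_bounces: float,
                           increase_percent: float = 32.0) -> float:
    """Expected number of bouncing visitors after the relative increase."""
    return baseline_bounces * (1 + increase_percent / 100)


if __name__ == "__main__":
    small = bounces_after_slowdown(3)
    large = bounces_after_slowdown(300_000)
    print(f"3 bounces become about {small:.2f}")
    print(f"300,000 bounces become about {large:,.0f} "
          f"({large - 300_000:,.0f} more)")
```

Three bounces become roughly four, and at the larger scale the same percentage means close to one hundred thousand additional visitors lost.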
Speaker 2:And this could be one thing. This is not directly about high availability but about low latency; we'll come to that, maybe afterwards as well. But yeah, this is some interesting information I got from Google, which is quite impressive, I guess. For example, when you have a page, a webshop, that you earn money from, it's a different story. When you have, I don't know, just a web page to show some information, or a blog, a slow page is not nice, but it's okay, I guess. But when it's a page like Amazon, this is money you lose only because you have a slow web page. And as I already mentioned Amazon: for Amazon we have, for example, the following. I got the information, I guess, three weeks ago. These numbers are from 2014, and I guess there are newer numbers from 2018 or 2019. They say that when Amazon goes down, it loses 66,000 dollars per minute, and I guess they were down half an hour, which means they lost more or less five million dollars or something around that. Which is a high number, for sure.
Speaker 2:And here we come to the topic of high availability, because in the end Amazon is a web page where you can go shopping and buy stuff. You pay more or less on amazon.com, and when the page is not available, Amazon loses money. And here it's the same as with low latency. When you have a blog, maybe high availability is nice, but it's not really required, or maybe the advantages of high availability are not really for you. But when you earn money from your website, maybe it's a more strongly required solution, and you want to have high availability. So basically, these are two facts.
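The downtime cost is a simple multiplication, sketched below with the figures mentioned in the conversation (66,000 dollars per minute and about half an hour). The function names are illustrative. Taken at face value, those two inputs give roughly two million dollars; a total near five million would correspond to a longer outage or a higher loss rate.

```python
def outage_cost(loss_per_minute: float, minutes: float) -> float:
    """Revenue lost during an outage at a constant loss rate."""
    return loss_per_minute * minutes


def minutes_for_loss(total_loss: float, loss_per_minute: float) -> float:
    """Outage length that would produce the given total loss."""
    return total_loss / loss_per_minute


if __name__ == "__main__":
    print(f"30 minutes at 66,000/min: {outage_cost(66_000, 30):,.0f}")
    print(f"Minutes needed for 5,000,000: "
          f"{minutes_for_loss(5_000_000, 66_000):.1f}")
```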
Speaker 2:I found them, and I think, based on the requirements and the solution you want to create, high availability and low latency could be something for you. So let's have a look at what low latency means. As I told you at the beginning, we have, for example, a web page. It doesn't matter if it's hosted in an App Service plan or in a VM. When there is a server in Germany, for example, and a customer from the USA talks to my page hosted in Germany, then there is around 200 to 300 milliseconds of latency, just because the whole traffic needs to go from the USA under the Atlantic Ocean to Germany and back. These are 200 to 300 milliseconds we just lose because there is no server close to our customer. For a blog, not a problem. Amazon, for example, should create instances quite near to their customers, because in the end, thinking of the bounce rate again, with 300 milliseconds gone you only have 700 left. When you have a server close to the customer, maybe there's a latency of 50, maybe 100 milliseconds, and then you have a little bit more time to load the data.
Speaker 2:For high availability we have, for example, the following. As I told you at the beginning, we have multiple instances around the world. What happens now when a customer from the USA wants to go to our website and the nearby instance is not available? He will get data from the instance in Germany, for example. Then for sure we have the problem of higher latency, but in the end maybe it's better to have a higher latency and still get data than to have low latency and, when this low-latency server is not available, not get any data at all. So either we have this wonderful world where everything is working, with a highly available system and low latency because the server is near to the customer, or, in the not so nice world, the nearby system is not available and we have a higher latency, but we still get data.
Speaker 1:so what happens if you have like one zone and one data center near you and you can't go elsewhere because data sovereignty? If you're in a public sector, what option would you choose? Would you choose like backup and stuff with scaling, because you can still use high availability and just like between two other resources in one zone in a way?
Speaker 2:So in one zone basically, yeah, one zone is spent. I don't have the correct numbers, but some hundred kilometers I would say. So it's quite near to the other center, so this is not a problem. I guess the bigger problem is really when you have yeah data centers around the world, then it's maybe a bigger problem.
Speaker 1:Okay, and these are all the data centers that you worked on.
Speaker 2:No, this is just a picture. I drew in some data centers. In the end, most of the time, it highly depends on the requirements here as well. For example, when you create a solution and you have some numbers from your company, and the company says, hey, 50% of our customers live in the USA, 20% live in Europe, and, I don't know, 30% in Australia, then you can create your solution based on these numbers, and honestly, then you don't need to have a deployment in South America, for example. But for sure you can add such regions as well, for example as a backup. Maybe when you add one data center in Mexico, it's not called very often, because in the best case all the US instances are working. But when the US instances are not working, maybe the Mexico instance is working, and then you have the backup in Mexico, which is, I guess, quite a bit nearer than Europe. So then it's not a problem to have the lower latency via Mexico, and this could be a solution. But in the end it depends on the requirements and the numbers from your company.
Speaker 1:Yeah, even if it's like a financial company, like trading, they want like zero latency, not point something. It just depends on the client itself, really.
Speaker 2:Yeah, yeah. Basically, now we could go to the serverless topic, but I guess this is not a good idea. Maybe some other time. In the end, the talk is a presentation I gave at conferences, and in this talk I go further into the topic of why serverless could be a good solution for such a globally redundant, high availability system, because of the cost of your solution. When you have VMs, no matter how much traffic there is on the virtual machine, you need to pay for it. In serverless you pay only for the requests. I go through some numbers and ideas, and I guess it's a quite good presentation, but for today it's too long.
Speaker 1:So would you recommend serverless or a PaaS service instead of a VM itself? Because a VM is just like on-premises, and it costs quite a lot. So people have the option to move to serverless or PaaS, like using App Service or Container Apps.
Speaker 2:Yeah, I mean, I'm a serverless enthusiast, I would say, so I really like using serverless. I have a YouTube channel about serverless, and for me it's a quite nice and good solution. But here as well it highly depends on the requirements. The advantage of serverless is that you pay only for the requests you get. So, in the best case, I guess 1 million requests cost you $4, and, I don't know, in the best case with 1 million requests you earn $10. So you make a profit.
Speaker 2:In such a case it's not a problem. It's the same when the load on your system is not predictable. In such a case serverless could be a better solution as well, because of the automatic scalability. You don't need to manage a lot of infrastructure. You can just focus on programming, doing your job, and that's it. There are some advantages, but honestly, there are disadvantages as well. For example, when you have a look at serverless, you have the Azure Function, and you can host it all over the world, in every region, and every request costs exactly the same whether you have it in only one region or in 100 regions.
Speaker 2:So this is quite nice, but on the other hand you need a Cosmos DB. The Cosmos DB needs to be globally replicated. Maybe you need multiple write regions as well, because otherwise you have the latency later on in the database. These are some disadvantages you need to keep in mind. With a VM, for example, you have the disadvantage that it's not so scalable, and you pay up front, or you pay a fixed amount. But in the VM you can host your application and you can host the database, and you have no problems, or smaller problems, with having a database instance there as well. For sure, you then maybe need some logic to replicate all the data, and in the end you have the advantage and the disadvantage of taking care of the whole infrastructure, with all the pros and cons.
Speaker 1:Yeah, plus, serverless is quite a bit cheaper compared to the cost of a VM itself. It would save you quite a lot of money as well. How would companies optimize cost for high availability? How would people scale it? You mentioned serverless, and Cosmos DB as well. If someone wanted to do FinOps in a high-scale, high availability architecture, how would someone optimize costs to be as low as possible?
Speaker 2:Yeah. What I always see as a consultant when I go to a new project is, in most cases, first of all over-provisioning. So you have, for example, an Azure Function app, and you think, okay, it's fine, they're using serverless Azure Functions, they have some kind of serverless-first mindset. And then you go to the Azure Function app and you see a premium plan, and you think, oh God, so they pay thousands of euros per month. Okay, wow. And then you think, okay, it's fine, let's look at the requests. And then you see, for example, two thousand requests per month, and I think, okay, they pay more or less one euro per request. This is not a good solution.
Speaker 2:In such a case, for me it's always: don't over-provision, start small, think about what you really need, and maybe have some kind of plan B for when plan A is working and it's growing and getting bigger, so that you then know exactly what you need to do to handle these new numbers as well.
Speaker 2:So basically you start with a consumption plan, or I guess it's called the Flex Consumption plan now. Azure introduced it in preview, and in general it's a better consumption plan. It's the idea of pay-as-you-go in an Azure Functions environment, and with this Flex Consumption plan you have the advantages of VNet integration, and I guess cold starts are more or less reduced. So in a company or enterprise environment, starting with a Flex Consumption plan is, for me, the best starting point, because in the end you can go to premium plans or whatever after some time. Or you start with Azure Container Apps, so basically a Docker container which runs serverless, pay-as-you-go. And when you don't want to do true bundled serverless, you can just use your own Kubernetes cluster. So this is a quite good solution as well.
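The over-provisioning example can be turned into a quick back-of-the-envelope check. The numbers below are the rough figures from the conversation (a fixed plan at about two thousand per month, two thousand requests per month, and a guessed pay-per-use price of four per million requests); they are not actual Azure prices, and a single currency is assumed for simplicity. The function names are illustrative.

```python
def fixed_plan_cost_per_request(monthly_fee: float, requests: int) -> float:
    """Effective cost of one request on a fixed-price plan."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    return monthly_fee / requests


def pay_per_use_monthly_cost(requests: int, price_per_million: float) -> float:
    """Monthly bill when only executed requests are charged."""
    return requests / 1_000_000 * price_per_million


def break_even_requests(monthly_fee: float, price_per_million: float) -> float:
    """Request volume at which both pricing models cost the same."""
    return monthly_fee / price_per_million * 1_000_000


if __name__ == "__main__":
    print(fixed_plan_cost_per_request(2_000, 2_000))   # cost per request
    print(pay_per_use_monthly_cost(2_000, 4.0))        # monthly bill
    print(f"{break_even_requests(2_000, 4.0):,.0f}")   # requests per month
```

Under these assumptions the fixed plan only starts to pay off at hundreds of millions of requests per month, which is why starting small and moving up later is the suggested order.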
Speaker 1:Okay, brilliant. As this episode is coming to an end, are there any last-minute resources or tips you would recommend to other people who want to learn more about availability and latency?
Speaker 2:Yeah, in the end, Microsoft's Azure Architecture Center is a quite good learning platform, I would say, where you can see architecture styles and learn from them. As I told you at the beginning, I used it for a while, although it's some time ago now. But honestly, this is a quite good place to find different architecture styles and existing solutions and, in the end, to learn from these solutions for your own work. So basically, the Architecture Center. That's all.
Speaker 1:In a way, you need to keep monitoring and know when to scale in order to minimize costs as well. Okay, so, are you going to any events? Tech events?
Speaker 2:Yeah, now in October, I don't have the exact dates in mind, there is the Serverless Architecture Conference in Berlin, where I have a talk exactly about this topic. In this presentation I go deeper into the topic with Azure Front Door as well as Azure Functions and Cosmos DB, and give a small demo of how we can use this solution for high availability and low latency. And then there are a lot more conferences, I guess, this year.
Speaker 1:Okay. Is that in person, I take it, that conference you're going to?
Speaker 2:It's in person, yeah.
Speaker 1:Right, okay. When is it?
Speaker 2:End of October, I guess the 23rd, 24th, I don't know exactly.
Speaker 1:That's brilliant, okay. Are there any last-minute tips you want to give people, or resources aside from the Architecture Center, or anything people should be aware of when designing a low-latency, high availability architecture?
Speaker 2:Oh, good question.
Speaker 1:Any challenges or pointers you would recommend someone be aware of, like some challenge you've faced in your journey?
Speaker 2:Yeah, one big topic. When you have low latency on your web page or web application, it doesn't mean that your database is low latency. So when you build a high availability, low latency system, have a look at your database as well. Because when you have your database in only one region, then it's nice that your web application is low latency, but then the call going under the Atlantic Ocean is not the one from the customer to the web page, but the one from the web page to the database server, and you don't win anything. So keep in mind that the database is important as well.
Speaker 1:Okay, all right. So thanks for joining this episode, Florian. In a few weeks you're going to be on the music Spotify and be live with O2 Media. So thanks for joining this episode. Bye.
Speaker 2:Thank you, bye-bye.
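A small sketch of this last point: the page's total latency is the hop from the customer to the web application plus every round trip from the application to the database. The function name, the latency figures, and the assumption of three sequential database calls are hypothetical and only meant to show the shape of the problem.

```python
def page_latency_ms(client_to_app_ms: int,
                    app_to_db_ms: int,
                    db_calls: int) -> int:
    """Total latency when the app makes sequential calls to its database."""
    return client_to_app_ms + db_calls * app_to_db_ms


if __name__ == "__main__":
    # App close to the customer, database in a single far-away region.
    remote_db = page_latency_ms(client_to_app_ms=50, app_to_db_ms=250, db_calls=3)
    # Same app, but with a database replica in the same region.
    local_db = page_latency_ms(client_to_app_ms=50, app_to_db_ms=5, db_calls=3)
    print(remote_db, local_db)
```

With the database left in one distant region, the nearby web application saves little, because each query still crosses the ocean.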