All right, thank you everyone for joining today. If you haven't done so yet, I opened the poll, so please answer it so we know what our audience looks like. This is the third webinar organized by Science-i. My name is Akane Abbasi; I'm a graduate student at Purdue University and also a member of the Science-i supporting team. Science-i is an international web platform to empower research and enrich the diversity of biodiversity sciences, and we have been hosting this series of webinars to boost conversation about diversity, equity, and inclusion in biodiversity and forest sciences. We are developing a white paper out of this webinar series. I just posted in the chat, so if you're interested in joining, please go to the link in the chat (I believe you can see the chat now, hopefully). If you leave your name and email address there, you're all welcome to join the white paper.

Also, please remember to mute yourself during the talk, but feel free to turn on your video; we love to see your faces. We will have a Q&A session at the end of the talk, so feel free to post your questions in the chat, or, if you prefer to speak, you can raise your hand during the Q&A session, unmute, and ask the question yourself.

So let's see how the poll looks. I guess most people have already answered the poll, so I will finish it and share it with you. All right, we have a big audience today: almost 30, more than 30 people joining. Thank you for joining. We have an audience from pretty much all the continents: the majority from Europe, Africa, Asia, North America, and Latin America, and four percent from Oceania. That is great. Moving on to affiliation, I guess more than half of the people are joining from universities, academia, and research centers, but there are also government officials and individuals as well. And your gender, female and male: I guess this is a good distribution, considering that our science is still a little
bit male-dominated. So yeah, thank you very much for sharing where you're joining from and your affiliation.

First, I'd like to introduce Dr. Jingjing Liang; he is physically in the Purdue meeting room. Dr. Liang is an associate professor at Purdue University, the founder of Science-i, and the coordinator of the Global Forest Biodiversity Initiative (GFBI), and he will give brief opening remarks before the talk. So, Dr. Liang, please take over.

Hello everyone, can you hear me? Great. Welcome to Science-i; this is the third webinar in our series. Sorry for the bit of technical difficulty at the beginning today, but it is my great pleasure to welcome every one of you. First of all, before we start, I'd like to ask you to please switch on your camera so that we can take a group photo. Akane and Angela, could you close the spotlight so that everyone can be on the screen at the same time, so we can take a group photo all together? Is there a way to show everyone? Connie, are you able to capture everyone? Just go to View and select the grid view, so that everyone is on the screen.

Okay, I have two pages; there are so many people. Okay, yeah, I can take pictures. Yes, I think a screenshot will work for it. All right, everyone smile. The first one, and then the second one: three, two, one. All right, thank you everyone.

Thank you very much, everyone. Before we start, I'd like to remind everyone of another thing: we are writing this white paper together, so if you would please leave us your name and contact information, we will be in touch with you so that we can develop this white paper together. So today it is my great honor to introduce our third guest speaker, Dr. Paddy Carroll. Paddy has a bachelor's degree in
physics, with minors in mathematics and chemistry, from Siena College. After that she earned a master's degree and a PhD in astronomy from the University of Washington. She's now a senior sustainability strategist and applied scientist with Amazon Web Services. Without further ado, Paddy, it's all yours.

Great, thank you for the introduction. It's great to see such a diverse audience from so many different regions around the globe. To be quite honest, I'm a little intimidated by the density of scientists and researchers in the audience; it's been quite a while since I spoke to a science-focused community. So let me go ahead and share my screen. Oh, it says the host disabled participant screen sharing.

Can you try it again? I just changed the setting.

And let's see if we can make presenter view work.

Yep, I see it.

Great. So today I'm going to be telling you a bit about cloud computing, more specifically AWS cloud computing, and its applications for science broadly, with some case studies in biodiversity and nature-based solutions. I can tell you a little more about my background. My degrees are in physics and astronomy; I got my PhD in 2016 from the University of Washington. During my PhD, the field of data science was emerging, and I was excited about using technology and data to make decisions and uncover new insights. That's where my passion was. That, combined with a draw out of academia, somewhat coincidentally landed me at The Climate Corporation, an agriculture technology company dedicated to leveraging data at scale (IoT sensor data, satellite imagery, weather, climate models) to help farmers make more informed decisions. I was with Climate Corporation for a little over five years, and that is when I became familiar with agriculture and agroecology. I worked very closely with domain specialists and scientists, and my title over
that time changed, but the scope of my work was always about leveraging data, applying the science, generating new insights, and helping Climate develop and test new product features in their digital platform for farmers. My work spanned from pest and disease management to carbon sequestration, soils, and biodiversity applications, identifying underperforming land areas that could more optimally be used for habitat restoration. So that's a little bit about my credentials. I joined AWS over a year ago, not quite a year and a half, in July of 2021. At that time there was a new team dedicated to sustainability solutions for AWS customers, and I was very eager to help accelerate the use of technology at scale to drive sustainability solutions.

Here is a high-level agenda. I'll introduce AWS at a high level. I'm not going to go too deep here, but I imagine there are some people in the audience who aren't really familiar with the concept of cloud computing or what Amazon Web Services does. I was familiar with AWS, and I used AWS while I was a researcher, but I never put much thought into why. So I want to introduce some cloud concepts and AWS at a high level, and then talk a bit about sustainability at Amazon and AWS, because that sets the foundation for the work that I do and for the remainder of the presentation, where I'll dive into how you can leverage cloud computing on AWS to accelerate the scientific process and the innovation cycle, more specifically around accessing and utilizing big data at scale, facilitating large-scale compute workloads, and collaborating across geographies, really globally. Then I'll end the presentation with a series of customer case studies.

I'm probably going to end up using some jargon that's industry-specific or Amazon-specific. "Customer" is one of those words that we use pretty religiously; it's kind of built into
Amazon's culture that we work backwards from the customer and always put the customer first. The customer can be anyone or anything that relies on us or our services. Customers are AWS users; they can be stakeholders; they can be our customers' customers or clients. I just wanted to put that out there to give you that context when I use "customer."

Okay, so starting off with high-level cloud concepts. Cloud computing, in a nutshell, is the on-demand availability of data storage, compute resources, and networking resources. Ideally you're not thinking about these things too much; you're just able to make use of them. But with the exponential increase in data available from many different sources, and the increasing demands of the computational workloads required of the average scientist or researcher, the compute infrastructure needed can be a high bar to meet, and the skill sets you need to leverage it efficiently can take up most of your time. We'll talk through that a little more in some of the use cases.

Cloud computing is often used in the context of hyperscale computing. Hyperscale refers to the ability of a computational system to scale appropriately as increased demand is added to the system, whether that's data or computation. In hyperscale clouds like AWS, functions are distributed over multiple locations, and each location is its own data center. This is an illustration of the different AWS regions; each region can have multiple data centers and availability zones. A data center is actually a physical building with a lot of computers, a lot of servers, in it; it's a dedicated space for hosting compute. So when you subscribe to AWS and use cloud services, you're accessing those computers over the internet. AWS has 27 regions globally, and, as I mentioned, each region has multiple availability
zones, each attached to a data center, an actual physical location. So the cloud isn't just an imaginary thing; there are actual physical locations and computers associated with what the cloud is and what it provides. So: 27 launched regions and 87 availability zones. We could get into more specifics around local zones and smaller-scale options, and Direct Connect, where you actually have a fiber-optic cable directly connecting you for faster compute and data transfer. There are over 410 points of presence, which basically means we're accounting for edge locations; edge compute is where you have offshoots of the core AWS regions. I don't want to get into much more detail on AWS infrastructure, since I don't think it's that useful to this audience, but I do want to introduce the AWS cloud as this global infrastructure of compute that is available for on-demand use.

Besides the infrastructure, we also provide solutions and frameworks to help optimize computational workloads. AWS has published what we call our Well-Architected Framework. There have traditionally been five pillars; just about a year ago we added a sixth pillar, sustainability. Each of these pillars represents one of our focus areas in developing new services and capabilities, so that our customers, AWS users, don't have to think too hard about the security, reliability, efficiency, or cost of their infrastructure. We believe that each of these pillars helps reduce the undifferentiated heavy lifting of IT infrastructure so that our customers can focus on the applications that meet their needs. I'll talk a little more about the sustainability pillar in a few slides.

Before I jump into an introduction to sustainability at Amazon and AWS and how we approach these challenges, I want to check whether there are any questions.

Okay. When giving presentations, I find it useful, and it's generally well received, to
introduce what we mean by sustainability, coming from Amazon and AWS. A lot of people just think about the e-commerce platform of Amazon, but Amazon actually has many different businesses, and AWS is just one of them. Part of this means that we have to talk about the culture at Amazon and our leadership principles. Less than a month before I joined AWS, Amazon added a leadership principle. Our leadership principles are guideposts for making decisions and prioritizing action at AWS. They do get updated over time, though not that frequently, and this and another leadership principle were added just last year to embody the responsibility that we have to our customers, to the planet, and to our communities to help create the new solutions needed for a more sustainable future, and our responsibility to be sustainable in the services that we provide.

In 2019, Amazon co-founded The Climate Pledge with Global Optimism, and this is our commitment to achieving net-zero carbon by 2040.
The significance of this is that it's 10 years ahead of the Paris Agreement, and the Climate Pledge is setting our guideposts for sustainability action. There are really just three core principles. The first is regular reporting and public disclosure of our Scope 1, 2, and 3 greenhouse gas emissions and other sustainability impacts. To date, the focus of the conversation around sustainability is still very heavily on greenhouse gas emissions, but that narrative is starting to shift to be more holistic, around ecosystem health: taking a step back and understanding why we're reporting emissions (maybe that's just a proxy) and understanding the associated risk to businesses and the impact of our businesses on the environment. The second principle is carbon elimination, and this is where we focus most of our effort: reducing the emissions and the impact of our businesses. The third principle is to use credible offsets to neutralize any remaining emissions. I know that's a hot topic, and we'll talk a little more about what that means and how we're approaching it at Amazon, but the gist is that credible offsets need to be additional, quantifiable, real, and permanent, and they should also have socially beneficial attributes.

So this is a quick highlight slide that we like to share on Amazon's sustainability journey, from disclosing our greenhouse gas emissions and publishing reports... Sorry, was there a question? No? Okay.

There was actually one question in the chat.

Okay.

Yeah, from Josh. He asks: is auto scaling, for example on ECS, just a specific method for hyperscaling, or are they distinct concepts?

Yep. So auto scaling is one way to achieve hyperscaling, but hyperscaling can mean a lot of different things, and so we would have different
architectures or patterns for leveraging services on AWS to meet a particular hyperscale need, depending on whether that's a data hyperscale application or a modeling-focused application, or maybe you need to stream data in real time to many different geographies. Auto scaling, for those who aren't familiar, is about the ability of AWS, when you have a workload running and the demand on that workload suddenly increases, to automatically scale up and add more server capacity so that your workload doesn't crash. So that's one tool that we provide on AWS to help with hyperscaling.

Okay, so the highlight I want to point out here is that Amazon has become the world's largest corporate buyer of renewable energy, which has a huge impact on reducing the emissions from energy usage in our operations. Amazon also obviously has a large delivery network for distributing goods purchased on our platform, and so Amazon is funding new solutions by investing through the purchase of electric vehicles and through dedicated programs that are actual venture capital funds to help our customers, partners, and third parties innovate and drive new solution development. I think that's all I really wanted to touch on here, but we'll provide these slides, and if you have any interest in diving more deeply into any of this, you can find our public sustainability website rather easily. Just Google it (I shouldn't say the word Google).

A little bit more on renewable energy: we currently have over 379 projects around the world generating renewable energy, and this puts us on a path to achieve 100% renewable energy across all of our operations by 2025.
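As a rough illustration of the auto scaling behaviour described in the answer above (capacity is added when demand on a workload rises and released when it falls), here is a minimal Python sketch. It is not AWS code and does not call any AWS API; the function name `desired_servers`, its parameters, and the numbers are all hypothetical, chosen only to show a simple target-tracking rule.

```python
import math


def desired_servers(current_servers, avg_load_per_server, target_load,
                    min_servers=1, max_servers=10):
    """Hypothetical target-tracking rule (illustration only, not an AWS API).

    Picks the fleet size that would bring the average load per server
    back to roughly `target_load`, clamped to [min_servers, max_servers].
    """
    total_load = current_servers * avg_load_per_server
    needed = math.ceil(total_load / target_load)
    return max(min_servers, min(max_servers, needed))


# Demand spikes: 2 servers at 90% load with a 50% target, so scale up.
scaled_up = desired_servers(2, 90.0, 50.0)
# Demand drops: 4 servers at 10% load with a 50% target, so scale down.
scaled_down = desired_servers(4, 10.0, 50.0)
print(scaled_up, scaled_down)
```

The clamp to a minimum and maximum reflects the idea that scaling is bounded in both directions. A real service would also smooth the load signal over time before reacting, which this sketch leaves out.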
Reaching 100% renewable energy by 2025 is actually a lot faster than originally planned: we were originally targeting 2030, but we're now on a path to achieve it by 2025. This has a lot of significance for AWS sustainability, because the bulk of our impact is the energy use of our data centers and of that compute.

I have a few slides on Amazon's work in nature-based solutions, elaborating a little on the third principle of the Climate Pledge, credible offsets. Offsets are just one component of what nature-based solutions can be. In general, the nature-based solutions program is really science-driven and led by PhD scientists. The aim is to follow the evidence and the scientific research in order to direct funding to initiatives, including research initiatives and innovation projects, that are most critical to staving off the most catastrophic effects of climate change. We focus on supporting the systems, the tools, and the development of new scientific knowledge to support scalable business models for nature-based carbon removal, and we focus on large-scale transformations that would be unlikely to occur without significant new investment.

A few examples of where we've dedicated funding to nature-based solutions include supporting family forests in the U.S., more specifically in the Appalachian Mountain region of the eastern U.S. This region has been identified as disproportionately important for conserving biodiversity and mitigating climate change, and Amazon has committed $10 million to help kick-start family forest programs. On the East Coast and in the eastern U.S., most of the forest land is privately owned, whereas in the western U.S. there's much more publicly owned land, so this program is dedicated to helping families that own forests sequester carbon. This is in collaboration with The Nature Conservancy. Another project is
focused on nature-based carbon removal in Brazil. With The Nature Conservancy again, Amazon launched the Agroforestry and Restoration Accelerator, with the goal of restoring native rainforest to naturally trap and store carbon and mitigate climate change. This also creates a more sustainable source of income for thousands of local farmers. We are also investing in green cities, or smart cities, through urban greening and urban forestry programs, and this slide highlights a few commitments that we've made across Europe: in the UK, in Italy, and in Germany. There are a lot of benefits to urban greening, especially when it's focused on communities that are otherwise underserved, that often have higher exposure to pollution in various forms, and that lack access to natural spaces. From the environmental perspective, the benefits include increasing urban biodiversity, improving air quality, and improving stormwater management, for example.

I'm just reading the question that came up in the chat: when Amazon works on projects focused on carbon sequestration, either to offset its own operations or as part of its investments, what kind of verification and monitoring are you doing? Okay, I can't answer that question in this forum, but I would be happy to follow up with you offline, David, if you'd like to email me your question. It's not that it's proprietary; I would just need to confirm what I'm actually allowed to say about that.

All right, this is the last one I want to highlight, and it's kind of a big one: the LEAF Coalition, Lowering Emissions by Accelerating Forest finance. Amazon was one of the signatories in creating this coalition, along with Unilever, SAP, GSK, and probably several more since it was originally formed. The program is supported by the Science Based Targets initiative, and the goal is to halt deforestation by
financing large-scale tropical forest protection projects.

Okay, let me pause before moving on to AWS sustainability and cloud computing for sustainability. If there are any other questions, feel free to come off mute if you don't want to put them in the chat.

Okay. So AWS and sustainability: we bucket this into migrating to AWS, optimizing workloads on AWS, and transformation, which is leveraging AWS to achieve your sustainability objectives, whatever they may be. For migration, there are a lot of reasons to leverage the cloud, as opposed to on-premises infrastructure, for the purpose of achieving more sustainable IT infrastructure. AWS joined the data center industry in Europe in creating the Climate Neutral Data Centre Pact, an industry commitment to be proactive in leading the transition to a climate-neutral economy. Regardless of where you are in the world, migrating to the cloud, compared to the average privately owned on-premises data center, results in improved energy efficiency, reducing the carbon footprint by up to 80 percent. With the path Amazon is on to achieve 100% renewable energy across our operations, that would translate, again on average, to a 96% improvement in the carbon footprint of your data centers.

(Sorry, I'm just moving some windows over here so I can see my notes.) Our global infrastructure is built on AWS's own hardware, which includes purpose-built servers, routers, and processors, and we have improved its power efficiency and availability over time. AWS Graviton and Inferentia are the latest generation of AWS-designed processors built for the cloud, and they use up to 60 percent less energy for the same performance than comparable instances or server types. This is our most power-efficient processor, and making this kind of processor available to our users for their decisions around workload architecture and execution is one of the ways that we help our
users optimize their cloud for sustainability.

AWS also has multiple initiatives underway to use water more efficiently and to use less potable water to cool our data centers. We use real-time sensor data to adapt water use to changing weather conditions. Our water-use strategies are very location-specific and watershed-specific, as they necessarily have to be, and they take into account the local water management and availability patterns. AWS was the first data center operator in Northern Virginia approved to use reclaimed water with direct evaporative cooling technology, which basically means that they don't have to use drinking water for cooling the data centers. If you're not familiar with utilities and regulations, it's surprising how difficult it is to achieve this outcome. In another U.S. region, AWS partnered with local utilities to build new infrastructure that enables the reuse of 96% of the wastewater discharged from cooling our data centers. Water and energy are both used for cooling the data centers, and there is a trade-off: if you use less energy, you often need more water, and vice versa. So it's important to consider energy use and water use together.

A few more highlights on water stewardship. AWS is working with the Swedish municipality of (oh gosh, I'm sorry, I'm not going to be able to pronounce this) and local water supply companies to support an upgrade to the town's stormwater infrastructure. This creates a new wetland just outside of town, and the new wetland and stormwater infrastructure improvement is supported by a $4 million contribution from AWS. The project is due to be completed in 2026. We also have a project in Ireland that uses direct evaporative cooling systems, which utilize outside air to cool the servers, and this basically means that for 95% of the year the AWS data center doesn't use any water at
all to cool its data centers.

Okay, so, briefly touching again on optimization. This is about workloads and IT infrastructure that are already in the cloud. I mentioned earlier the AWS Well-Architected pillar for sustainability. The Well-Architected Framework helps cloud architects build secure, high-performing, resilient infrastructure for a variety of applications and workloads. I mentioned that we traditionally had five pillars, and last year we created the sustainability pillar. As I mentioned, sustainability is fairly tied to energy use and also tied to cost optimization, but it was worth it for us to call out sustainability specifically, in part to meet customer demand and in part to meet our own internal commitments. It's important for our researchers and our developers to also be considering sustainability in the development of our applications as we work with our customers, so the Well-Architected Framework is an internal tool as much as it is an external tool to help optimize workloads. The sustainability pillar enhances the framework by providing a way for users to consistently measure their architectures against sustainability best practices for cloud computing and then identify areas for improvement.

In practice, building sustainability into cloud workloads is about understanding the impact of the workload (sorry, I just realized my mouse is moving around): understanding the impacts, quantifying the impacts, and applying best practices to reduce them. The tips and guidance we provide, without going into too much detail, are largely centered on improving power efficiency; choosing serverless, which means that if you have a workload or a job, you don't need to spin up an entire server for it, and can instead choose a serverless deployment model that uses only the amount of compute needed for that job, so you don't waste any compute resources; and integrating our Instance Scheduler so that servers are
shut down and terminated when they're not in use. Our Cost Explorer also helps customers right-size their workloads through its recommendations, and, as I mentioned before, auto scaling is another way to help customers right-size, because if they need additional compute resources it can automatically scale up and then automatically scale back down.

Lastly, I want to mention the AWS Customer Carbon Footprint Tool. This was also announced a year ago at re:Invent, and it is an interface in the AWS console for users to track, measure, and even forecast the emissions associated with their AWS usage. So we're calculating the carbon emissions for AWS workloads. You can also understand the historical trends in carbon footprint, and you can see that alongside cost optimization, so you can make the best decisions for any given application. We also provide forecasting, so that for migration opportunities, large ones in particular, you can see the forecasted trends in emissions as the migration moves forward and as AWS, at the same time, moves toward 100% renewable power.

Okay, so the second half of my talk, if I'm not running too slow, will focus on transformation: use cases of the cloud for science more broadly, and then some case studies in biodiversity and nature-based solutions. But real quick, a check-in to see if there are any questions. Okay. Sorry, was there something? No? Okay.

When we talk about transformation, we often use the term digital innovation, which is a way to accelerate transformation. Sustainability transformation is really about the innovation of new solutions to accelerate the path toward a sustainable future, whatever that may mean. At AWS we provide resources for our customers and users to leverage Amazon's learnings, processes, and mechanisms to innovate quickly, with a plan, and in a way that allows you to scale fast, fail fast, and iterate. So the first
step is identifying customer needs, brainstorming, generating ideas, and evaluating the opportunity against whatever KPIs make the most sense, whether that's cost, sustainability metrics, or even user experience and the number of users reached. Innovation is at the center of this process, of course, and it is about discovering solutions and doing so collaboratively, with AWS technical specialists and domain specialists, with our customers, and with the Amazon Partner Network. By selecting a solution or a series of solutions, we can then accelerate the development, the actual technical build of a prototype, and then implementation. That's the last stage here, but really this is a circular process, where you deploy the solution into production, capture whatever KPI metrics you need to evaluate it, and feed that back in.

I wanted to highlight this because it is really the first step in many of the case studies that I'll describe in a bit. From idea to implementation, AWS, and Amazon more broadly, is stepping up to meet customer needs and to help define, plan, and design really new solutions. This is a silly schematic that shows the typical workflow. It can go back and forth and around in circles, but the bulk of the time is spent in discovery and very little in validation, and then launch and continuous improvement is just a straight line here, where things move very fast and a lot of technical debt can accumulate.

I'll come back around to this question at the end, if that's okay.

We have innovation programs with dedicated facilitators to help our customers and users walk through this process efficiently. We use an approach we call working backwards. It's very similar to design thinking concepts, if you're familiar with those, but working backwards starts from a problem to shape solutions, execute, and then develop the mechanisms for continuous improvement.

We also provide specialists, both
I'm using sustainability very broadly here because it encompasses a lot of different topical areas for different industries, but sustainability is often a horizontal need across industries and sectors. So we have technical domain specialists working closely with cloud architects and engineers to shape solutions. By solutions here I mean reusable patterns: the architecture, the blueprint for a common use case, which can then be customized and tuned to a particular need. (Oops, sorry.) The innovation cycle that we provide also helps our customers shape the solutions that we provide at scale, more broadly and globally, as well as future services, so it is a two-way street.

Sustainable agriculture is a core use case for our customers. This is often challenging because farmers themselves are typically not AWS users, so we approach the agriculture industry through our customers within it, whether they're providing crop inputs (seed, fertilizer), digital tools for farmers, or financing or insurance. How do we provide them with the tools to accelerate a sustainable future? At its simplest, sustainable agriculture is about how we cultivate the Earth to meet humans' most basic needs: food, clothing, and shelter, with the emphasis here on food, because most of the cropland globally is used for food production. But we also need to protect future generations' ability to meet those needs. Biodiversity and nature-based solutions often come up in conversations around sustainable or regenerative agriculture, which is why I wanted to highlight this. We know sustainable agriculture is a multifaceted problem, and the conversation historically has focused on topics like carbon credits or biodiversity (I'm putting that in air quotes, because it isn't on the slide), which doesn't really take account of the underlying challenges and ultimately
the goals so we're working to shape the solutions that will better address True Farmer needs and less focus and we do have teams that are focused on the agriculture industry needs as well but um creating space to help those customers that are directly addressing farmer needs without regard to profit is there a question okay that's the same question so just to uh reiterate again um sustainable agriculture it faces many challenges and we don't have much more land without clearing forests we can't clear forests without adding more carbon into the atmosphere we need to use less water less fertilizer which is a huge source of emissions and less energy in producing our food we need to reduce the footprint of Agriculture food production itself accounts for over a quarter of global greenhouse gas emissions and a majority of that is in the actual crop production and also a huge part of that is animal agriculture um where agriculture is also very exposed to environmental conditions and so we have less predictable weather patterns looking into the future of climate change um and we need to reduce the risk to people and communities and the planet from our production systems and so to help address these needs um we start with data and I'm going to break this up into data and compute and modeling and so starting with data the challenges that researchers and policy makers are facing globally boil down to some common uh Trends in data and that's you know the volume of data is growing exponentially it's coming from a variety of new sources and also old sources um that need to be then digitized and standardized uh it can be raw data it can be derived data and then you have issues with reproducibility and versioning many different data types different data qualities that's not always Quantified increasingly diverse user needs of the data um in many different applications um that are processing and analyzing the data at scale so we aim to help customers effectively capture
find access and utilize um the data they need to build new Solutions and a core part of that is the Amazon sustainability data initiative which is an investment program dedicated to increasing the accessibility of sustainability data on AWS so there's many different data layers that are kind of highlighted here this isn't meant to be exhaustive um it's actually a little bit tuned to Agriculture and land use sustainability use cases but several of the customers that we've supported in making data available for free publicly on the cloud are Illustrated on the bottom here if you'd like to explore data sets you can just go to registry.opendata.aws this includes ASDI datasets as well as broader open data sets and the power of this is it's available for free um it's hosted uh in a storage bucket in S3 anybody can access it um it's public it's also optimized for cloud computing um and this is most relevant for geospatial data sets uh you want to have the data formatted in a way that makes it easy to get only the data you need in an efficient manner without pulling down very large cumbersome data sets and extracting pieces of all these data sets at the same time um so we make the data available for free in a format that makes it easy to use um and often you can uh you know schedule a job that uses the data without ever having to download it to your local servers or to your instances so from data we can talk about actually utilizing data in the cloud in a more effective way to accelerate sustainability action and when designing a solution we typically have four phases in the process um ingest transform learn and act I'll briefly explore or share some architectural diagrams and Concepts but I didn't want to go too heavy with architectures uh I certainly wasn't familiar with architecture diagrams until I joined AWS but these are pretty high level flow charts S3 is um most people who know anything about AWS are familiar with S3 it's probably the one
of the services that along with EC2 are the most foundational that we have it's really a storage service for data storage um so ingest data can come from many different uh sources I mentioned ASDI and open data that's third party data that's out there and available and you want to use it um so maybe that comes in from AWS open data or AWS Data Exchange which is a subscription service that you can access all this data via API and maybe some of it's coming from IoT data at the edge and so we have IoT services to optimize the use of sensor data and formatting that and storing it on S3 and then maybe have some other on-premises data whether that's your um own model outputs for example that you have stored locally um maybe it's um more metadata that you need uh to parameterize modeling and so we want to be able to optimize the storage of this data and getting it all into one place transformation is in this context more about the processing uh whether you are streaming data and making that available whether you're doing big data processing a lot of scientific research leveraging climate and weather environmental data layers falls under the bucket of big data processing whether you're making that data available for querying for other users via API or through a user interface transformation is the second bucket and learning this is simply put like machine learning and AI but can include any type of modeling really statistical modeling anything where you need to infer an outcome or a scientific result and so we provide a series of services kind of highlighted on the top uh here for different types of use cases whether it's natural language processing or image processing or forecasting and our users can leverage these services to again reduce the undifferentiated heavy lifting of machine learning applications and cross-validation and then the fourth bucket is action making uh any solution able to trigger an action and so AWS Lambda is highlighted down
on the bottom right here and that is a serverless service that allows you to send a command as long as it can be computationally triggered or maybe there is a user interface where an alert is given or an SMS text is sent um but most often we're talking about Control Systems um HVAC is a great example where maybe you want to send a command to the system to change or to stop or to start uh this um action can also include visualization and services to provide dashboards and applications for actual people to take action and then of course some exposing APIs so action of course perturbs the system so the cycle always repeats but architecting in these kind of buckets allows you to continuously improve and make better decisions all right I'm going to talk a bit about some customer case studies but let me pause I think there was one question in the chat but I've lost my chat window I can read the question okay from salvadorai if I understood correct you indicated higher efficiency calls for higher water use I thought higher efficiency for example efficient processors would generate less heat and in turn should use less water for cooling I am wondering if you could clarify this thanks I think I missed the first part of that question could you repeat it if I understood correct you indicated higher efficiency calls for higher water use I thought higher efficiency for example efficient processors would generate less heat and in turn should use less water for cooling I'm wondering if you could clarify this um and so I think I had said that uh energy and water um kind of go hand in hand uh rather than efficiency but maybe I'm misunderstanding the question um or I don't recall when the question came in uh but energy is used uh mostly energy used in data centers is for cooling um and if you want to reduce energy uh but still maintain cooling there's a trade-off with using water for cooling um and so I think the question was
talking about more using more efficient Hardware uh and yes that will reduce the need for cooling so that's um I think a little bit then like that's another approach to reducing the overall energy demand um by using more efficient uh Hardware but uh in terms of the cooling itself you either need to use air um or energy that uses air uh or uh water-based cooling methods I hope that answered the question but feel free to follow up in the chat were there any other questions no that's it okay we have a question from Purdue sure hello do you hear me yeah yes I am from Purdue uh I have a question about the uh advantage of using AWS service for Quality Beauty so as a research scientist at Purdue I use high performance Computing resources at Purdue and it's quite useful I mean it's kind of affordable and kind of easy to use but uh most of the researchers in agriculture outside of interior engineering would use Standalone machines or at most high throughput Computing Resources uh I wonder if using AWS has some advantages with kind of computing efficiency or the amount of data storage or kind of a how easy to share data with other researchers or people outside Academia you know and what are the uh usual costs to uh use the AWS services all right I think I caught most of your question um and I think that this case study will actually address it a little bit a lot of researchers still use on-premises HPC compute clusters and they're purpose-built and we don't want to and don't need to replicate all of them there are ways to spin up a cluster in the cloud and to provision it and to make it available reliable and usable for scientists but that's a little bit deeper than I'm able to share on that I think that between on-premises infrastructure whether that's for research compute clusters or other corporate use cases the cloud can be very supplementary there's a term that we use called hybrid cloud and we have services and
features that are dedicated to optimizing hybrid Cloud for individual researchers if we were to connect the cluster where the data is stored or the models are stored and make that data available via API then a researcher can obviously pull that data into their computer you could also do that entirely in the cloud if you upload data to the cloud um then a researcher doesn't really need a heavy duty machine that can handle that amount of data or compute and they can leverage cloud resources to do that and they just need an internet connection essentially you could use a Chromebook and so there's power in using the cloud for that kind of application because then the individual scientists they don't need much more than an internet connection to be able to access the data to run the compute and you don't even need to use the same machine from time to time as long as you have you know the login credentials to the account um does that help answer your question yeah yeah that's a general pros of using cloud computing you know yeah it'll be really nice if you could compare the costs or yeah kind of prices would be when using Standalone machine or AWS but if there's no nice a number on your slides okay I'm sorry I didn't really catch everything you just said there okay so yeah uh you mentioned uh most about the general pros of using AWS for researchers uh but it'll be really nice if there is kind of a cost or monthly subscription payment to use the Amazon web service for research uses at the end of the discussion or some time later yeah let's follow up so I can make sure that I understand your question um if you want to like I don't know if you can throw it in the chat just so that I can read it there's a I caught most of what you said but I wasn't entirely clear on some of the words that might have been defining in the question that you were asking but let me share this case study um maybe this will help I think that you're asking
a maybe a more specific question that we can elaborate on at the end of the talk or we can follow up on after um sure yeah so this is actually just part of a customer case study um and we'll be sharing this at um AWS re:Invent it's our biggest annual uh conference of sorts and that's happening at the end of this month so I can't share all the details yet and the name of the customer and um some of the specifics but if you're interested um the presentation will be public it'll be on YouTube um probably by mid-December uh so feel free to um uh check it out then but I can share at a high level the customer this is a customer that collects creates and processes like petabytes of data every day environmental data um more specifically and this data and the model results and the insights are leveraged by researchers and organizations and policy makers across the globe so they were looking for a managed service to reduce the cost the time and like the actual resources energy and people resources associated with sourcing data managing the data processing it and sharing it and making it available to other resources so the heart of the challenge was to address the fact that scientists today spend a disproportionate amount of time on data munging and data management and this has been true for a while but that burden is increasing as the amount and complexity of data is increasing and they were also you know seeking a solution to help improve traceability of the data processing workflow in order to accelerate the validation process for peer review and reproducibility thank you so this part of a scientist's job is critical um and you know if you don't have that validation that traceability and reproducibility uh you're prone to errors inaccurate data or incorrect interpretations and so the goal here was to reduce the undifferentiated heavy lifting on the scientists to automate that process for the data sourcing and management and compute so that scientists
can recover more of their time to focus specifically on scientific discovery I'm going to show this is an illustration kind of going from this scattered model where you have scientists in multiple locations and in different locations they have different access to different data sets and are maybe Limited in what they can do by the data they can see and they can import quickly to a model that really orchestrates the data access and compute across AWS regions the geographic regions globally so each region would have its own virtual private Cloud that's what VPC stands for um and these would all be connected with uh it um Cloud infrastructure that makes it efficient for scientists no matter where they are to view to discover uh data sets in a user interface um and then access that data uh regardless of what region they don't have to worry about where the data is located where they have to pull it from because it's all centralized in this one platform system so this improves the speed and ease of data Discovery and loading it automates the distributed compute for large-scale modeling and Analytics and also automates um the efficient orchestration of the computational workloads across AWS regions this is a solution that's really uh valuable to Global collaborations so a fun Jupyter notebook here for illustration um a little bit I'm sorry this is a little bit repetitive other outcomes besides the science user experience is the reduced compute time and the compute energy both for this customer and considering the compute resources that are used by all of the different users of the data set and so there's actually very significant savings estimated over traditional approaches to data access and with reduced compute time and a more efficient architecture significantly reduced energy that goes into that compute as well um I'm sorry I don't have uh numbers here but again um feel free to check out this session it's titled utilizing
sustainability data at scale um and so that's happening at the end of November um and there'll be a YouTube uh video uh released shortly after that if you'd like to know more about this specific customer and use case so I think I want to make sure to save some time so I'm going to move kind of quickly through some highlights on um how AWS users are leveraging the cloud for more biodiversity specific applications NatureServe um is a leading source for biodiversity data in the Western Hemisphere and has been for decades um they work with uh a network of 100 biodiversity information centers and um over a thousand conservation scientists some of you may be familiar with NatureServe um already so what NatureServe wanted was to help address the need for precise species distribution data um and they developed this online geospatial tool on AWS uh and it's called the NatureServe Environmental Review Tool um you can check it out on their website I think they have a nice story map um and this can be customized for any given jurisdiction's regulatory environment to help inform land managers and land use decisions from the very earliest stages of planning to the end of the Project's life cycle so this tool helps guide more proactive conservation decisions [inaudible] Systems is a technology service provider that is working to address restoration of natural ecosystems globally the unprecedented level of data in ecology and environmental systems um opens up the possibility of really harnessing artificial intelligence to help with efficiently inventorying ecosystems and identifying problems quickly so that action can be taken these problems can Encompass a lot of different things but plant condition stress invasive species like weeds species Decline and erosion are common uses here CENIGA is um the national Center for geo-environmental information um in Costa Rica this is a technical intelligence unit and they're using mapping and modeling tools on AWS to integrate
standardize and help communicate Costa Rica's and share out Costa Rica's environmental data and then lastly the Natural History Museum this is actually there's not much to share here but we pretty recently announced a partnership between Amazon web services and the Natural History Museum to develop a digital twin for UK biodiversity building the data platform to store enrich and service Urban biodiversity data each of these um I can provide links for I kind of went through these really fast but you can read more detail about each of these case studies including more specifically how they leverage AWS on our website uh and I can share those uh links out after the presentation but with that why don't I um call it good on the talking and open it up for more questions and discussion thank you very much Dr Carol that was a very impressive number of projects AWS is conducting with Cutting Edge technology and so yeah we're gonna move on to the Q a session so if you have questions you're free to raise your hand and unmute to uh speak your questions yourself or you can also post your questions in the chat sorry y'all I've lost my screen here I've got this chat on my other screen was that I hope that wasn't blocking your view of the slides no no no it wasn't okay and in the meantime we have one question in the chat from Javier um I know you can go ahead transparency and data public dissemination must be inclusive almost by definition so is AWS sustainability program taking proactive roles to reach those less likely AWS customers that happen to be the ones protecting large numbers a short answer yes um the AWS sustainability data initiative uh specifically um is an investment is providing uh free consultation free Cloud credits um for high impact opportunities and the definition of high impact kind of goes back to the guidance around the climate pledge and really what qualifies as credible and impactful that doesn't take into account the social aspects and would this
have happened otherwise or not and directing our resources to generate the most benefit in the long run so for AWS services and I think your question was about AWS sustainability more broadly and I just answered more specifically around the data initiative a broader AWS organization actually has a team dedicated to social impact more specifically and that overlaps with the more um I hesitate to say commercial but uh the more industrialized side of sustainability um and so the sustainability data initiative actually I mentioned the registry of open data that registry of open data is itself a separate program um and it's not specific to environmental sustainability but they uh help incentivize data access and compute resources and run a variety of different social impact initiatives that are specifically targeting underserved communities um I can look up uh some information more specifically to indigenous communities as well I just don't have that off the top of my head thank you I think there is a question from the Purdue room hi uh Paddy thank you so much for your presentation uh so I'm very glad to know what Amazon has done to conserve the nature and also protect the Earth and so on behalf of Science-i I would like to say that we are very interested in what Amazon has been doing and I think we have a lot to contribute as well so when we talk about the data so Science-i in collaboration with our sister initiative GFBI which is Global Forest Biodiversity Initiative we have compiled and collected one of the first and the largest Forest inventory database across the world and that's a ground-sourced data and then we are talking about we are having hundreds of our members of Science-i and the GFBI lead local crew members and so to go to the forest measure every single tree and then come out to a global database and that's cost Decades of time and also millions of dollars to conduct all those big data and all those data very uh are critical
for us to analyze and discuss the global forest species across the world as well as we've mapped the distribution of tree species locally across the world as well in the most high definition map as well so um with that in such many of the underrepresented groups from the global South regions we are facing difficulties in getting those data measured now especially after COVID and after the deficiency in funding and so we as Science-i we were wondering if Amazon would be uh interested for example to support the global Forest the data collection and uh foresting and the local Forest inventory as well yeah that sounds like a pretty ideal use case for the sustainability data initiative and our impact Computing team as well which dedicates more build resources I'd love to follow up on that uh more specifically if you'd like to send me an email we can have a conversation that can pull in the right people from the AWS and Amazon teams to talk about what that could look like Yes sounds good are there any other questions yeah I actually have a question yeah um I wanted to know what it takes to um get an account um on Amazon web services and uh what Computing resources as an individual researcher um I can get if I have an account mm-hmm yeah I think it you can just sign up for an account um you do need to like enter billing information but there's a surprising amount that you can do um on what we call the free tier uh there's a certain amount of compute that's available to you for free um and so you can use that to explore AWS services to prototype Solutions um and things like that there's a ton of resources available including like tutorials um and a GitHub repository with different examples uh but I'm pretty sure you can just go to probably not aws.com is it it redirects to aws.amazon.com and in the upper right you'll see uh create an AWS account okay and it's pretty straightforward from there it'll walk you right through it and then I'm pretty sure
that it will also uh direct you to various resources for getting started um there's also I mentioned re:Invent before but um re:Invent at this conference we do a lot of workshops and they're live in person uh workshops but some of them are recorded and also uh posted on uh YouTube um a week or two after the event and so that can be another good resource for you for sustainable architecture and some use cases around leveraging climate model data and weather data that's already hosted on ASDI for different use cases like identifying deforestation or predicting air quality and things like that thank you very much sure I have a question sure oh yeah I'm Stanford it's not from Purdue um yeah as a research scientist again uh most of the individual researchers would do uh ingest transform and Learn by themselves but I think the uh decays in the absolutely feel a little difficult people might make a team to do the ingest separately and make a separate team for transform and learn as well so if I don't know I'm just asking uh I'm just wondering about your personal experience when you I wonder you could compare your experience when you're doing research in University when you're doing PhD and your experience doing some of the work regarding ingest transform learn at Amazon yeah I'm having a little bit of trouble hearing uh the room um I caught you know a lot of what you said but I didn't think I got the core of your question around you're asking about my experience um but more specifically could you repeat yourself or maybe put it in the chat if you have that ability uh can you clearly hear my voice right now yeah yeah I am wondering if you can compare your experience doing your research while you were in PhD and doing your research in Amazon web service if you have done some sure yeah my research experience um I haven't done much research in the year or so I've been in AWS but in my prior role um at the ag tech company I did a
lot of research there and so I can definitely compare when I was a PhD researcher we weren't using cloud computing we had a local cluster for distributed compute and I was limited to whatever um my local computer could handle like my actual workspace and then anything additional was a kind of a steep learning curve to understand the basics of parallel Computing and optimization with pretty limited tooling and guidance available internally to help with that um and so some of the biggest challenges that I really I think came to appreciate more later um because that was around the time that data science was becoming a thing and reproducibility as a challenge and issue in science research was becoming a headline um that uh the everybody kind of doing it for themselves you don't really have the what one you have a lot of redundancy um and two you don't have the transparency and the visibility for validation and verification uh to make sure that as a scientist you're not kind of messing up the actual engineering processing components because you probably learn to code just to get what you needed to get done and you didn't actually learn to code properly um and so there's uh a transition that occurred and I think it reflects the broader transition in the science and data science uh Community where more of the ETL jobs were handed off to specialized teams and those teams you know if they were good they're providing the tools and the guidance for scientists to use that in my experience in a science team within research and development we had a data engineering team but then there needed to be a specialized team of data engineers and there needed to be a specialized team of scientists to make sure that scientists are actually getting what they needed to get and then the engineering team often you know maybe if they like misinterpreted or they you know decide that there's some trade-offs in the quality or the speed of the data available it was a point of
friction and so cloud computing and you know it's not unique to cloud computing it just makes it a lot easier because you have these building blocks of services and it's sort of plug and play and you can package it up or containerize it and hand it to another person where they can just run and reproduce exactly what you have um or you version your Model results and that's stored and it's attached with metadata um that allows you to track back exactly how you got to this derived data set um this is like Leaps and Bounds beyond what I experienced as a PhD student and it might have been that my bubble was too small at that point I'm sure that there were uh teams and especially large-scale Computing teams both in the public and the private sector that were addressing some of these challenges but I think it's much more accessible now once you have the enablement on how to leverage it even while I was at uh Climate Corporation um we were the scientists were leveraging a platform service that even um obscured the actual behind the scenes of choosing the compute resources and optimizing that infrastructure for you so that we could you know just spin up a Jupyter notebook um and have a nice user interface that automatically has like Version Control built in and a peer review process built in um that's called Domino some of you might be familiar with it there's other platforms like that Databricks is one um and I'm sure there's several others and even some of those platforms and maybe all of them also obscure whether you're using AWS or another cloud service provider for the most part um and so we're moving in this direction of creating more of these tools and platforms so that you know scientists ideally can really just like focus on the work that matters most for scientific discovery and insights and driving innovation but there are still some hurdles to overcome you know anytime you simplify a system or create a derived data set there's like built-in assumptions that
can be problematic later so it's really about um in my mind um enabling the transparency and the reproducibility and the traceability so that you can see how you got where you are that was a long-winded answer thank you so much for the uh your explanation by your experience thanks so much there's a question here in the chat about um scientists are not software engineers and our code tends to be rough and messy um what would the process look like to migrate an analysis from running some process locally yep lift and shift anything that runs locally you can run in the cloud it's not going to magically make your code better but we do actually I know that there are services that are designed for like automating like code review and making Smart Suggestions and um I know there's some Services built on AWS and there's tools and plugins more broadly that you can leverage um but uh yes I can definitely relate to that I actually postponed my dissertation because I knew at that point I wanted to move into industry uh to do a Google Summer of Code internship that was pretty much a boot camp in um software engineering for industry and for production um and so I think some of that is also you know providing the right training to scientists in their undergraduate and graduate careers so that they don't develop that sort of a technical debt of sorts um and have to go like relearn how to properly develop code yeah that's my two cents on that thank you Dr Carol I think it's about the time so it's a good time to wrap up and so I like to take a moment again to remind you of the white paper we're developing let me post in the chat briefly yes so yeah please leave your name and email again if you're interested in joining the white paper and I will hand it to Dr Liang at Purdue to give a closing remark thank you colleague and it is uh our great pleasure to host this uh wonderful webinar today and so uh we at Science-i we are here to enrich diversity of biodiversity sciences as
well as Forest Sciences by empowering everyone especially the underrepresented groups to do great research and to this end the cloud computing as well as the associated data infrastructure is a key component to the success of this collaboration and we are very glad to start a conversation with Paddy Carol and also the Amazon web services about building up a collaboration between Science-i and maybe the AWS for future research projects and uh also to this end we will be working together towards this white paper just uh thinking about how we can strengthen the research of the underrepresented communities by building a strong collaboration between the industry Partners like AWS and Academia so to this end we will welcome your inputs and your continuous support of Science-i as well as our data compilation process and so finally I would like to uh thank Dr Carol again for this wonderful talk and uh so uh with that we will be closing today's webinar and we will be uh looking forward to continue working with you on the white paper thank you very much thank you everyone