DroneCamp 2020: Vegetation Analysis & Classification in ArcGIS Pro

Iryna Dronova: Hi everyone, welcome to this morning's session. My name is Iryna Dronova, I am a professor at UC Berkeley, and I will be your instructor for today's vegetation analysis session. I'm really looking forward to it. Very quickly, I'm just going to stop sharing my screen to say hi to you all. Hi, welcome to the camp. If you don't mind, maybe you can type a quick message in the chat window saying your name, where you're from, and what is your level of comfort with ArcGIS Pro, one being not really familiar, five being pretty comfortable and using it for quite a while. I'm going to do the same myself as an example. So I am– Maggi Kelly: Iryna, Maggi here. Might I ask if people are comfortable sharing their video just for the beginning so we can all see each other's faces? Iryna Dronova: Sure. Awesome. Hi again everyone, nice to see you all, and welcome again to this session. People might still be joining, so I'm gonna wait just one more second before starting the slides. I hope everybody is doing well, and I hope this week has been going well for you as well. While we're waiting to get started, I'll just tell you a little bit about myself. I have been doing various kinds of remote sensing work. I was a former student of Maggi and some other professors at UC Berkeley, where I got my PhD in environmental science, but with a lot of emphasis on remote sensing. And nowadays I work with very different types of data, from satellite images to aerial images and now drones as well.
So as many of you, I'm also really excited about this whole advance in geospatial technology and the exciting new opportunities that it offers us, but we're also trying to actively learn new tools to be able to navigate the new challenges that come with some of these data, as we will also see in today's exercise. So I'm really excited about the camp and I'm thankful again to Maggi and Andy and Chippie for inviting me to be an instructor for this year. Vegetation analysis is one of the very popular and much desired applications, but of course it's not a single question of how we detect and look at vegetation from these data; there's a whole bunch of different areas and purposes for which people may be interested in doing this. And hopefully some of the tools that we'll cover today will be universal enough that you could see how you could apply them in your work or study or some applications that you might be thinking about. Welcome again. So it's exciting to see such a variety of different participants and backgrounds, as I can see popping up in the chat window. I'm going now to share my slides for a very quick introductory overview of today's exercise, and then we will dive right into ArcGIS Pro. So I'm gonna start sharing my powerpoint screen right now. So our topic today is vegetation analysis and classification in ArcGIS Pro.
Just as a quick reminder, please make sure you have the data for today's exercises downloaded and unzipped, ideally in a folder on your desktop so that it's fairly easily accessible to you. During the session, please stay on mute. If you have a problem or difficulty with the software, please type in the chat window and we will help you individually, or you can also raise your hand using the participants window button. If you get badly behind, please try to watch and catch up at the next checkpoint. We will have several checkpoints during the exercise, and at some of them you will also have a chance to ask questions directly if you'd like to speak up with a question. Or again, type them right then in the chat window and we will address them as we go. Okay, so let me go back to my screen. So for today, we have just done this little exercise introducing ourselves, and I myself have not been using ArcGIS Pro for that long. So you may have noticed when I typed my answer I put four. This software is very similar in many ways to ArcGIS Desktop, which I am very, very closely familiar with, but it does offer a few very interesting novel features
that actually will be very handy in this exercise and represent a really interesting way this technology is moving forward and helping us tackle more challenging new data, such as very detailed drone imagery. So again, we would want to download and unzip the data for the exercise and save it in a desktop folder. If you look into what the download contains, you will notice that it has a folder named 'data' in it and also a folder with the outputs. The data folder has one remote sensing image that I will introduce shortly, which we will use in today's exercise, but the other folder contains outputs that you will hopefully be able to derive today. But if for any reason you run into computer challenges or the software is too slow, you can plug those in, and I'll tell you when we get to those points so you don't have to wait until your personal machine finishes doing something. If it takes too long and puts you at risk of falling behind, you can actually use these pre-made outputs as a backup for your exercises, because they would be identical to what you would have produced. Okay. So moving forward, our goals for today are exploring several strategies for detection and basic mapping of vegetation. We will practice computing a spectral index. We'll practice an interesting technique called image segmentation and learn how to adjust it to different purposes of mapping, and particularly vegetation analysis. And then we will also try a more complex classification workflow where we will then reclassify outcomes into a vegetation cover dataset. So this is just a quick overview of the steps, and you have an exercise file that has these steps outlined in more detail. So the way today will be set up is that we will do it together, going through all the steps and discussing various outcomes and nuances of the tool use.
And you also have that for you, so when you're done with this, or if you'd like to repeat this again or explore a little bit further, you will be able to do so outside of this workshop in the future. A little bit about the methods that we're going to cover today; I'll discuss them in more detail when we get to the actual steps, but just briefly. So these are three different strategies that are very common in vegetation analysis. One of them is called a spectral index. It's a very basic mathematical conversion of an image into a single raster layer that basically tells you where things are more likely to be vegetation versus not. And there's more than one way to do it, but we will practice a particular common index that people use often with digital photography and with drone imagery; it's called the green chromatic coordinate index. And people also use it in various ecosystem studies, because it turns out that this index is sensitive to vegetation biomass, productivity, and greenness in the plant physical sense, which helps people understand how healthy plants are and how they grow and so on. We will also practice, as I said, image segmentation, which is a technique for delineating regions within the image.
And then those regions can be used as mapping units, instead of individual pixels or individual grid cells. And with very detailed drone data, this technique can be very, very helpful for smoothing local noise and for helping us tackle this landscape in a more holistic way, actually getting at the various landscape elements that we might be interested in, in their sort of geometric and conceptual entirety, rather than just working with little grid cells that can sometimes be quite noisy and speckled locally, as we will see. And then we will also practice a more advanced classification workflow. And this particular workflow is very interesting because ArcGIS Pro is pretty much the only tool right now that allows you to do it in a streamlined and user-friendly interface. Many other tools that practice similar techniques are available through different software packages, but they're not integrated; you have to patch up different parts of the process, whereas ArcGIS Pro offers you a wizard tool that allows you to plug multiple steps into one process. And this becomes very convenient, particularly when you're doing this analysis for the first time and you're still exploring the landscape. So we will work a little bit with this technique, see how some of the insights of the previous steps can also feed into this workflow and help us map vegetation, and discuss how we can
expand this analysis into more detailed vegetation analysis in the future. The image for today's exercise comes from an open aerial database. It's a portion of a drone photograph of a vineyard area in Paso Robles in San Luis Obispo County in California. So, wine country. It's an interesting landscape because, for our purposes, it provides a variety of very common, recognizable and yet different vegetation types, such as trees, lawns, vineyards, and shrubs, but also a number of non-vegetated types against which we would like to be able to detect plant features. And something else that's interesting about this landscape, which will keep popping up through the exercise, is that it also has a few trees that are not green. This particular area here, as you can see, has these two rows of ornamental trees; they're alive, they look healthy, but they have purple foliage. You may have noticed similar trees in places where you live or work. And they are not exactly green, even though they're technically very much vegetation. So we will see how that can affect our analysis and interpretation, and how we can develop some workarounds when this becomes an issue. So without further ado, I am going to dive right into ArcGIS Pro. So I'm going to stop sharing this powerpoint for a second and just direct you all to either your ArcGIS Pro, if you already have it opened, or to your start menu and then Pro. You can type in Pro or ArcGIS Pro and click to get the software started, and then I will share my screen with the software to walk us through the main part of today's exercise. So now I'm going to share my screen with Pro. Okay, so hopefully we can all load. So you need to be signed in. I'm just gonna very quickly check the chat to make sure everybody is okay. Okay, so thank you, Chippie. So we're on page two of the exercise document now, and we will start as usual by creating a new project dedicated to this analysis.
So if you are not signed in, please sign in. You can see I'm signed in here in my top right corner. Arc Pro offers you a variety of options for how you would like to start. We would just like to start with a new, blank template. Let's use a map for this exercise, so I'm just going to click on this map, and it gives me an option to create a new project. It's a good idea to have a separate project for this exercise and call it something that you would be able to easily find later. So I'm gonna call mine 'veg analysis.' And keep this check box checked to create a new folder for this project so that it stays in its own dedicated space. So when we've done this, I'm clicking 'okay.' Oh sorry, well, I'll just do another one; this comes from my practice runs of the workshop workflow. So we're creating a new project here, and you should start seeing something similar on your screen, the map interface coming up, gradually loading the data. By default, on some of your computers, Arc Pro might, similar to mine, load the topographic layer. This is completely optional. It just comes as a default option on the map in case you would like to see where you are on the landscape. It might sometimes slow down loading and processing due to the display demand, so I would suggest removing it. We can simply right click on it here in the contents pane on the left and say 'remove.' We don't really need it because we're only going to be working on this particular locality of our image. Now we're going to add our data. So we go to the map tab on top of the ribbon here in the upper left corner, and we can either use this pull down menu, or simply click on this plus, and it will give you an option of adding the data. So we would want to navigate to where we have saved the data for today's exercise. As recommended, I have it on the desktop in my DroneCamp folder
So we're just navigating there, and the image that we're going to be analyzing as the primary dataset is in the folder called 'data.' So if you navigate to what you've unzipped, go to that little data subfolder, double click on it, and there will be an image called 'vineyardRGBs.tif.' So it's a TIFF image, a raster layer. We just click 'ok' and we load it in. It might take a little bit of time, so we can wait for this to happen. UAV data very often have high spatial resolution. This particular image has a spatial resolution of 5 centimeters. And what that means is that even if we're capturing a relatively small landscape, the data is still pretty detailed. There's a lot of information packed within these very small five centimeter pixels, so it may take a little bit of time to load this in. Just quickly checking the chat window, okay. Yes, if you lost the contents tab for any reason, or some of these side panels, the view tab allows you to bring this back. So you can always click to bring back contents, or something like that, to go back to this. So now that our image is loaded, we can explore it a little bit. The first thing I'd like you to notice is that it has three layers here displayed in the red, green, and blue channels of the computer display.
I'm actually going to close this panel here on the right too for now. And you can see here that the image looks very much recognizable. It's almost as if you yourself were flying above this landscape and observing it from above. And this is because what we have here is what is called RGB imagery, where RGB stands for red, green, and blue. This image has three digital layers in it. If you're familiar with the term raster, which stands for a grid cell image dataset, there are three rasters packed in this image. And we're displaying them here in the red, green, and blue channels of the display. But based on the nature of this particular photo, it also happens that these three RGB layers in the photo are actually red, green, and blue, meaning that they represent brightness of the landscape, or what we also call reflectance of the landscape, in red, green, and blue light from the visible light spectrum. And that is very handy, because this is exactly how our human eyes have evolved to see the world; our eyes emphasize the reflectance of solar energy from different surfaces and landscapes specifically in those regions, red, green, and blue. And therefore, when we display them in the red, green, and blue channels of the computer screen, we actually see the landscape the way we have intuitively evolved to see it as human beings with our human eyes. Sometimes you can have access to data or instruments that have more or different layers or bands. They may have reflectance in, for example, the near infrared, which our eyes cannot see but which is another very useful type of data for vegetation analysis. But for today's exercise we've chosen RGB because it's very common and typically more affordable, so a lot more different cameras and instruments have it, including cameras on our phones, security cameras, and so on.
And so it's quite easy to work with this data, because intuitively we can recognize a lot of things and we can do a quick check of whether our approach is working or not. But even with such data, we can still perform really interesting analysis to highlight the presence of vegetation and be able to map it. So feel free to pan around and zoom a little bit in and out on this image. You can notice that this area here has a little bit more detail to it; there's some water, trees, shrubs, lawns, and a lot of them look green or greenish, and so do these rows of vines. One thing you can notice is that color is very important in telling us what vegetation is in this landscape. It's typically green, while roads and bare surfaces are more brown, beige, gray. But in addition to color, we can see that vegetated features can also have more or less unique geometry and shape. For instance, a lot of trees here are either roundish or they're in these elongated rows, clumps that also have a characteristic shape. Vineyards are these very long
features, so the vines are growing together, they're overlapping, and sometimes they have a fuzzy boundary where the branches are protruding outside. And so in addition to color, we can also recognize shape and geometry as something that helps us, at least visually, tell vegetation apart from other cover types. Just quickly checking the chat. So we can also look into these trees that I mentioned earlier. They do look like normal trees but they are a little bit purple, so we will see what that can do to our ability to recognize vegetation features on the landscape. So how do we deal with this information? What is packed here? How do we go from an image like this to an indicator or a map of cover? The first basic step to do so would be a vegetation index calculation. Before we calculate it, I'd like you to click anywhere in the image in what looks like a green vegetated space, for example, this tree here. And when you do so, a little pop-up window comes up. So I'm just going to do this again to be sure. What you'll notice is that you see these values here. These are pixel values for the pixel where I clicked, for each of the three bands, red, green, and blue, or layers or channels as we call them here. And you can notice that if we click on green features, the green value is a little bit higher than red and blue. I just wanted to point this out because this goes back to the earlier notion that different surfaces interact with solar light, and they do so differently. Plants are unique, among other features, because they utilize solar light for photosynthesis, for growth, for production of their biomass. And they particularly like doing so in the red and blue regions. They have evolved to particularly make good use of red and blue light.
And therefore they take in a lot of red and blue light and don't reflect as much, but they do reflect a little bit more green. They also use green light, but less than red and blue, and therefore this reflectivity or brightness in the green region often happens to be higher. And we can see so from clicking on the green pixels. We would see that this green reflection, which is actually loaded in the green channel here, tends to be stronger, a higher value than for red or blue. If we do this for other surfaces, let's click on this bare soil between the vine rows, you can see that this is no longer the case. Here red happens to be the highest; on the road it's actually the blue, and so on. So plants are kind of unique in that sense, and this is what the spectral index, the green chromatic coordinate that we're going to try right now, is exploiting, and so do many other indices. You may have taken some other exercises through this week that also showed you other forms of vegetation indicators. There's more than one; there are a lot of different ways, and they keep evolving. But the one that we're going to try today is fairly common, specifically with RGB data, and it only requires the red, green, and blue bands. So now, just to orient ourselves where we are, we're moving to part 3.0 on page five of your exercise: compute a basic spectral indicator of greenness. In ArcGIS Pro, we will have to use the individual layers, the individual bands, to develop that formula. And we cannot do it from the image displayed like this composite; we have to load each of those layers individually. So we can do so by again going to 'map', 'add data.' So clicking on our 'add data' and, rather than clicking on this image as we did before, we're actually going to double click to get inside this tiff file. So I'm going to double click, and if you do so you will notice that there are three bands sitting inside that initial vineyard file, and we are going to bring all three of them together.
And we can do so by holding the ctrl or shift button and clicking, like so. So you can see they're all selected, and now we can click 'ok.' You will notice that these layers are now popping up in our contents pane here: three, two, and one. So if you'd like to explore them a little bit more, you can do so by clicking or unclicking them. We don't need to spend too much time on this. You can notice how in some of them, for example the red band one, where plants do absorb light, plants actually look a little bit darker than, for example, in the green band, and so on. But what's more interesting is how we can use these layers of information to compute a
single index. And this is also practically more important. Rather than working with three layers at a time, can we compress this information into one layer, into one image or one raster, that more meaningfully computes vegetation values, the strength of something being vegetation? The values here, you can notice, range from 0 to 255; this is an artifact of the fact that we're dealing with what we call 8-bit data. In other words, every pixel can only take a maximum of 256 values, from 0 to 255, in its brightness intensity. And if you don't know much about 8-bit data, the way this works is 2 to the power of the bit depth. So for 8-bit data, 2 to the power of 8 gives the maximum number of values that the pixel can take. And this allows us to have some variability in possible gray levels inside the pixel. But what's most important about these values is that they represent the strength of that brightness, the strength of reflectivity from each pixel, for red, green, or blue. And so the lighter these values are, like on the roofs, you can see here on the buildings and the roads, the stronger is the reflectance. The darker they are, the lower it is. And because different surfaces interact with light differently, this will be reflected in these brightness signatures, and that's what our index is going to be taking advantage of as well. To compute the index, we are going to use a very handy tool called raster calculator. It's available through both the image analyst and spatial analyst toolboxes. So what we can do here is simply call for it from the tools. We're gonna go to the tab called 'analysis' here on top of the ribbon. And the easy way in ArcGIS Pro is to simply click on this tool button, which brings up a search window called geoprocessing on the right. And we can simply search there for 'raster calculator.' You can see it comes up from both the image analyst and spatial analyst toolboxes.
I'm going to move this a little bit to the side. So we can just click on either of them, they're going to be identical, and the tool will load into this right pane. Again, it takes a little bit of time. Okay, here we go. So I would recommend expanding this just a little bit to help us see what the different inputs are that we're using here; we can always compress it back. So the formula that you have seen earlier on my slide and in your exercise is basically a ratio. So this green chromatic coordinate, I'm going to call it GCC from now on, the GCC index, is simply the ratio of the green band, where plants reflect a little bit more than in the red and blue, to the sum of reflectances in all three bands, green plus red plus blue. It's a ratio, so we have to type in that formula here using these different operators and tools. And one thing I should say is that the raster calculator tends to be a little bit picky about syntax and notation. So ideally we should try to use the operators from here as much as possible. If some of them are not available, like parentheses, then we can use them from our keyboard. So we can take a quick look: here in the rasters we have our layers, and we have these different tools. One other thing that's important for this exercise is that, before we add our individual raster layers as inputs, we want to use an operator called float from the math menu.
I'm just scrolling down here in this little scroll menu, and you can see this float. And the reason why we need to use float is because, as you can see, this index is a ratio, which means that there might be decimal values in there, and the decimal values in the index will be something that you won't see if the data are kept as integer. And this particular index, you'll notice, also takes values from zero to one. And the challenge is that if we let the software do the default job, it will simply produce either zeros or ones, depending on whether your index ends up being closer to zero or to one in its decimal value. If we use float, it will preserve those decimals and allow us to avoid this problem.
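To see why the Float() step matters, here is a minimal sketch in Python with made-up 8-bit band values (not taken from the vineyard image): integer division throws away exactly the decimals the index lives in.

```python
# Hypothetical 8-bit band values for one "green" pixel (made up for
# illustration): green is a bit brighter than red and blue, as for plants.
red, green, blue = 80, 120, 70

# Integer division truncates decimals, so an index that should fall
# between 0 and 1 collapses to 0 (or occasionally 1).
gcc_integer = green // (red + green + blue)
print(gcc_integer)       # 0 -- the decimal part is lost

# Converting to float first (the Float() operator in Raster Calculator)
# preserves the decimal value of the ratio.
gcc = float(green) / (red + green + blue)
print(round(gcc, 3))     # 0.444
```

The same truncation happens cell by cell across a whole raster, which is why the default integer output would look like a binary zero/one image.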
So in the numerator I'm just going to double click on this float tool. You can see it brings this up. I want to put in my green band, which is band number two, so I'm gonna select it from my rasters menu on the left, like so. Then I'm gonna go up and select the division operator, to divide, to create a ratio. It might produce this little error note here. We can ignore it because we haven't finished adding our formula yet; it's just the intermediate process. I am now adding parentheses from my keyboard, just to have them in place to make sure all of my denominator in that index fraction is kept together. And now I'm going to add the float value of each of these three bands, one, two, and three, and sum them up. So I'm going to click on 'float', 'band 1', then I'm going to click on 'plus.' Then I'm going to add again 'float' and then band two, and then I'm going to click on 'plus', and finally I'm going to do again 'float' and band three. Okay, so I'm double checking everything, and it seems like it makes sense. I can delete this gap between the parentheses, but we can also leave it. For the output raster, if we click on this output line, by default it goes inside our geodatabase, which is part of the project; the way ArcGIS Pro creates a project, it creates a new geodatabase. So we would call this GCC, just to be sure that we can recognize it later easily. Again, double checking everything, and then we can click 'run.' If you click 'run' and you get an error, oftentimes it happens because of some problem with the syntax. So before we do this, we'll just review this again. Everything seems correct here. So I'm clicking run; the process is running, and it might take a slightly different time on different computers. Hopefully we can all get it to run.
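Outside of ArcGIS, the same formula we just entered in Raster Calculator can be sketched as a band-wise array operation with numpy. The tiny array below is a made-up stand-in for the vineyard raster, not real data:

```python
import numpy as np

# A made-up 2x2 RGB "image" standing in for the vineyard raster:
# shape (band, row, col), 8-bit values.
img = np.array([[[ 80,  60],    # band 1: red
                 [200,  90]],
                [[120,  55],    # band 2: green
                 [190, 100]],
                [[ 70,  50],    # band 3: blue
                 [180,  85]]], dtype=np.uint8)

# The Float(...) step: cast the bands before dividing.
red, green, blue = img.astype(np.float32)

# Green chromatic coordinate: green / (red + green + blue), in [0, 1].
gcc = green / (red + green + blue)

# Pixel (0, 0), the green-dominant "vegetation" pixel, gets the highest GCC.
print(gcc.round(3))
```

The division is applied cell by cell, which is exactly what Raster Calculator does across the full five-centimeter raster.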
While it's running, I'm gonna very quickly look at the chat here. Wonderful, okay. Okay, so you can see something has popped up already. At this point we can fold this geoprocessing panel back a little bit, or we can even close it if we want to. If we'd like to see the index in its entirety, we can go to 'contents', right click on the index in the contents, and say 'zoom to layer' to bring us back to the full extent. Right now, by default, it's grayscale from light to dark, and that's not surprising because it's just one layer now. So what we can actually do here is change its symbology to see the colors a little bit more clearly with respect to where we find vegetation and whether we have succeeded in getting closer to it with this index. You can also notice the values range from zero to one, so it was a good call to use the float operation; otherwise we wouldn't see these nuances, everything would be either zero or one. To change symbology, we can simply right click on this index and select 'symbology' from the pop-up menu. And it's going to open in the place of geoprocessing. And at this point we can keep pretty much everything as it is, except the color scheme. For the color scheme we can choose some sort of two-color palette, maybe with contrasting colors. In my example in the exercise I have this one, but you can choose whichever one you like. So this palette here is going to emphasize things that are not high in index values, close to zero, as more red, and things that are closer to one as more green. When it loads like this, you can see that there's not a lot of contrast in vegetation areas, and we can improve this by playing with the stretch type a little bit. Sometimes using histogram equalize or standard deviation helps deal with this. So I'm just going to use histogram equalize here. That's a little bit better. So you can now see green values.
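The idea behind the histogram equalize stretch can be sketched in a few lines of numpy: each value is mapped through the data's empirical cumulative distribution, so the color ramp gets spread evenly over the values that actually occur. This is a simplified sketch of the concept with made-up skewed values standing in for the GCC raster, not ArcGIS's internal implementation:

```python
import numpy as np

def histogram_equalize(values, levels=256):
    """Map values through their empirical CDF. This is a display
    stretch only; the underlying data values are untouched."""
    hist, bin_edges = np.histogram(values, bins=levels)
    cdf = hist.cumsum().astype(np.float64)
    cdf /= cdf[-1]                        # normalize CDF to [0, 1]
    # Look up the CDF of the bin each value falls into.
    bins = np.clip(np.digitize(values, bin_edges[:-1]) - 1, 0, levels - 1)
    return cdf[bins]

# Made-up skewed "index" values: most pixels crowded near 0.4, so a
# linear ramp would show little contrast among the vegetated pixels.
rng = np.random.default_rng(0)
idx = np.clip(rng.normal(0.4, 0.05, 10_000), 0, 1)

stretched = histogram_equalize(idx)
# After equalization the displayed values spread across the full ramp.
print(float(stretched.min()), float(stretched.max()))
```

Standard deviation and percent clip stretches are alternative mappings with the same goal: spending more of the color ramp on the value range where the data actually lives.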
This stretch is not changing our data; it's just changing how this color palette is distributed across different values of the index. And with histogram equalize, it basically splits the palette a little bit more evenly between different data values, which helps us see plants as green. So I would recommend, at this point, unclicking or even removing your individual bands. We can fold them in contents just by unclicking them, so that we can compare our GCC to the original image a little bit more easily. We can zoom in, and we can see that indeed the green color, which we chose here intuitively for visualization, does tend to be darker and denser in areas where we have vines, trees, and some other green vegetation. So water, unfortunately, also looks pretty green in this image. It could have algae or maybe chlorophyll in the water; even in the visual image it looks pretty green. This lawn here is also very distinctively green, so we can see that this index does pick up vegetation with its high values pretty well. But as you keep zooming, you might also notice some other really interesting features here. If we zoom into a tree, and I like this tree as an example, so let's zoom in here. If I unclick my GCC for a second, one thing that's interesting about this index is that in the original image, the tree has a shadow here on the left, and it's kind of hard to see the boundary between that shadow and the main tree. The index seems to be dealing with this quite well. Where the shadow is on the ground, it's very non-green, it's something totally opposite, and we can see the outline of the tree more clearly from here. But the other thing that we can see, both from the original image and from this index that we calculated for individual pixels, the little grid cells in the image, is that there is a lot of heterogeneity. If we zoom in even further, we can see that there is a lot of speckle in the GCC values inside this tree crown, where sometimes they're higher, sometimes they're a little bit lower, and sometimes they're quite low inside the tree, like here in this gap. And that is not surprising, because this tree does have some gaps in the canopy. It might have some roughness in its upper part, where some leaves are very sunlit and appear brighter and lighter, and some leaves are in the shadow, so they might not even look much different from non-vegetated branches or bark or something like that. And you can start appreciating this challenge of mapping something from UAV and drone data.
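That speckle can be put in numbers: a moving-window standard deviation of the index highlights where per-pixel values are locally noisy. Here is a small scipy sketch on a made-up array with a smooth "soil" half and a noisy "canopy" half (illustrative data, not the vineyard raster):

```python
import numpy as np
from scipy import ndimage

# Made-up "GCC" patch: smooth bare-soil values on the left,
# speckled canopy values (sunlit vs shaded leaves, gaps) on the right.
rng = np.random.default_rng(2)
gcc = np.full((20, 20), 0.33)                    # smooth soil half
gcc[:, 10:] = rng.normal(0.45, 0.06, (20, 10))   # noisy canopy half

# Local standard deviation in a 3x3 window, via the identity
# var = E[x^2] - E[x]^2 computed with two uniform (mean) filters.
mean = ndimage.uniform_filter(gcc, size=3)
mean_sq = ndimage.uniform_filter(gcc ** 2, size=3)
local_sd = np.sqrt(np.clip(mean_sq - mean ** 2, 0, None))

# The canopy half is far more heterogeneous than the soil half --
# exactly the fine-scale noise that makes per-pixel mapping hard.
print(float(local_sd[:, :8].mean()), float(local_sd[:, 12:].mean()))
```

High local standard deviation inside a single crown is the quantitative face of the speckle we just saw, and it motivates grouping pixels into segments rather than classifying them one at a time.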
It's a real blessing to have such amazing spatial detail that we can visually recognize a lot of features on the landscape and do a visual check on our mapping outcomes, but at the same time, for mapping purposes, for detection, we are often not that interested in all this fine level spectral noise. We really would prefer to see the tree mapped just as a tree, as a single feature, maybe with some gaps if they're really big, but this sort of heterogeneity would make it very difficult to map it. Where do we draw the boundary? Where do we say that this noise is part of the tree versus not? And this is not necessarily a flaw of this index; the index is still effective at finding areas that are green vegetation and differentiating them from everything else. It's an issue with the pixel size. Our five centimeter pixels are very, very small, and that makes it very difficult to really assign them to objects or elements of the landscape, like trees or vineyards, if we only work with those teeny tiny pixels as mapping units. Because they would often have a neighbor that is a completely different color, a completely different shade, darker or lighter, just because the data picks up all this detail at the very high spatial resolution of the imagery. And this is why, in the subsequent part of our exercise, I would like to introduce you to another suite of techniques that try to circumvent this problem. They can still allow us to work with image spectral data in the original channels, and with indices like GCC, but we can be a little bit more strategic about navigating this noise and complexity. And we will use a suite of techniques called image segmentation. It's a general approach, it can be done in multiple different ways, and I'll show you how we can do it conveniently in ArcGIS Pro and what logic goes into it. So what we're doing here is moving to page 9 of our exercise, part 4.0, called segmentation.
One thing that I would say we should also do at this point is save our work. It's a good idea to do so repeatedly through the exercise to avoid losing it. This type of index calculation is fairly easy to recreate, but some of the other steps take longer, so saving our project after every step is kind of a good idea here. Just quickly: sometimes you might not find the exact color scheme. Again, let me show you, if you really want to follow very closely what we're doing here,

if you scroll all the way to the bottom, so in this color scheme menu that comes by default, if you scroll all the way down to the very bottom, this one is one, two, three, four, five, six, I believe it's seven or eight from the bottom up. It's called red, yellow, green continuous. And then in this example I'm using histogram equalize stretching, but we could also use standard deviation or percent clip or some other option. This one was a little bit easier for seeing these immediate outcomes. Okay, so we will leave our index as is, but we will now try image segmentation, which is a process of delineating regions in the image. And to do so I'm going to close my symbology tab here on the right pane. And I'm going to go again to my analysis tools here, which brings me back to the search for tools, and I will type 'segmentation.' So you can see this 'segment mean shift.' This is the name of the tool that ArcGIS, both Desktop and Pro, uses for this particular process. It's available through two toolboxes, image analyst and spatial analyst, and we can choose either one of them. And it asks us for a couple of options of what we would like to consider when we delineate these little regions. And this is where I would like to spend just a little bit of time to develop a logic for our vegetation mapping exercise. I'm going to zoom out my data a little bit to see more of the landscape. We're dealing here with an interesting project, right? We have a landscape that has quite a bit of vegetation in it, but this vegetation comes in different sizes, different shapes, and somewhat different colors. When people develop segments, the outcome is basically these little regions in the image that are like separate objects. And we can either assign them to certain classes like trees, roads, buildings, or we can use them as what we call prototype mapping units.
In other words, we can merge them into larger patches of different cover types. And the second strategy, where we develop smaller regions first and then assign them to classes of interest, like vegetation or non-vegetated surfaces, is actually more recommended in landscapes that are a little bit more complex. If we try to get all our objects, all our elements, exactly correct in their outlines right away from this step, we will spend a lot of time and we will end up being very frustrated, because it's very difficult to achieve in a single run, and often not possible in a landscape like this. And that is because a lot of these tools work with pixel data in these red, green, blue channels to try to find what might look like a little region, depending on the criteria we specify. But because our goal is to reproduce regions of very different sizes, very different shapes, very different levels of internal complexity (for example, rows that are smoother in color versus trees that, as we have just discussed, are more heterogeneous), it's very difficult to set that all up in one step. So our strategy here is to select these segmentation parameters so that we end up with reasonably decent prototype objects or prototype segments. By the way, these terms, objects, segments, regions, are often synonymous, and people use them interchangeably in this type of segmentation analysis. So I'm going to try to stick to the words regions or segments, but if I accidentally say objects, it basically means the same thing. And here's how ArcGIS Pro allows us to approach this process. It has three parameters called spectral detail, spatial detail, and minimum segment size. And these three parameters together, plus the selection of band indexes, which are basically which layers from the image we want to use, will determine the outcome.
And the way these values are designated, specifically in this software, is that spatial and spectral detail can only take values from 0 to 20, and I'll explain shortly what that means, while the minimum segment size in pixels is how many pixels we want to have in the smallest objects. In other words, an object should not be smaller than a certain number of pixels. So first let's specify our input raster. What we want to do here is run

segmentation on our original image. And so we would want to select our three red, green, blue bands, but we can't select them separately here, so we should select the original image like so. For the output dataset, again by default you can see it places it in our project folder; we can name it something else. I would actually hold off on this just a second until we specify parameters. I really like to code parameter values into the name of the file, so that if we do this multiple times we can keep track and we know exactly what parameters were used in each derivation. So let's just pause here for a second. For band indexes at the very bottom, our image only has three bands, so we might just use all three. So I'm just gonna put one, two, and three, and that will tell the software to use all three channels, all three layers from the original image. Spectral detail. These interesting parameters are not immediately intuitive. But spectral detail, think of it as a weight. It basically tells you how much weight you want to put on the color differences among the various things you might be interested in mapping. And the thing here is that we know color is pretty important for differentiating vegetation from other things in this landscape. We don't have any artificial turf here as far as we can tell, so basically everything that's green is vegetation, so we do want this parameter to be fairly high. And again, it's possible to put a value between 0 and 20. But we also noticed that sometimes inside vegetated features, like trees, there can be pixels of a completely different color and tone that ideally should still be in the same region. And that we can help account for with the parameter called spatial detail. So spatial detail is a parameter that tells you how much you would prioritize proximity when you delineate your regions.
And essentially what it means is that if you have pixels that are pretty different, but they ideally should belong to the same type of entity, like within trees where you have sunlit and shaded pixels, it's a good idea to set that parameter higher, because that would help create somewhat more compact regions and put a little more emphasis on proximity, which would pull those pixels together in the resulting objects. So we can play with these. It's always a good idea to try a number of different values before deciding what's best for your project, but for the sake of the exercise, we're just gonna set the spatial detail parameter to the maximum, to 20. And we would set spectral detail high, but a little bit lower, because again we could have instances of very differently colored or shaded pixels in the same type of entity, like a tree. So I'm choosing here 15 and 20. And then there's the minimum segment size in pixels. This is a tricky one, because to decide how small your smallest regions should be, you have to consider two things. One is, what is the target you're mapping? Are you likely to end up with a lot of really teeny tiny clumps of vegetation in your image, and is that something you're looking for? Or would you rather prefer just delineating broader clumps of vegetation, emphasizing the most pronounced patches of plant cover? This could really vary by application. You can imagine that somebody studying insect habitat, for instance, might really be interested in detecting even little clumps of green grass wherever they occur through the image, but a more general assessment of vineyards or mapping of trees might primarily emphasize the large entities on the landscape. Another big factor in consideration here is spatial resolution. So the second criterion is how detailed your image is.
Here we have five centimeter pixels. For most entities that we would like to map here, like trees or vineyards, five centimeters is a very small part of that. So really we would not want to have objects with very few pixels in them, because they would likely be just little patches of noise or spectral variation. So I would suggest setting this value high for this kind of image. In my example I use 600. It could potentially be even higher, depending again on what exactly the target is. We are largely interested in trees and vineyards, and even though the vines here are kind of narrow in their rows, they're long and they stick together, so there are still a lot of pixels in anything we would call a patch of vine here. So now that we've set our three parameters, I would change the name of my

output dataset to reflect that. I would call this seg, for segmentation, and I would put the parameters in the same order: 15, 20, 600. It's always a good idea to keep track of this. It's totally up to you how you name the files, but I've found it a useful practice to help me keep track. So we have 15, 20, 600, and we have our three band indexes, and we're ready to hit run. I should tell you that this is the kind of process that takes a little while on most computers, so it's a good time for us to take a short bio break. Before we do that: if you come back after five minutes and your software is still running and it looks like there's no end to it, you can load the output from the files that we provided with this exercise, from the folder that has the outputs. We have a segmentation output there, so if this takes too long, please do not be frustrated; just add the data from that folder into here and you will be able to see the segmentation image. I'm going to hit run just to make sure it starts running here, and while it's running, we're gonna take a five minute break, and I'm also happy to take questions. I'll be here during the break, so please feel free to use chat or speak up, or if you'd like to clarify anything from the earlier steps, you're most welcome to do so now. Well, thank you for the wonderful questions. I think now it's a good time to end our bio break and continue with the process. So yes, I will stay a little bit longer after the session for a few more minutes, or you would be most welcome to contact me by email if you have follow-up questions about these tools; my email was on the slides. And you can also just look me up in the UC Berkeley directory.
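While the tool runs, here is a toy illustration of the mean-shift idea that gives Segment Mean Shift its name. This is a simplified one-dimensional sketch, not the ArcGIS implementation: each pixel value repeatedly moves toward the mean of the values within some bandwidth of it, so values collapse onto the modes of the distribution, and pixels sharing a mode form a segment. The brightness values and bandwidth below are made up:

```python
import numpy as np

def mean_shift_1d(values, bandwidth, iters=20):
    # Each point repeatedly moves to the mean of the original values
    # that lie within `bandwidth` of its current position, converging
    # onto a mode (density peak) of the distribution.
    pts = values.astype(float).copy()
    for _ in range(iters):
        for i, p in enumerate(pts):
            near = values[np.abs(values - p) <= bandwidth]
            pts[i] = near.mean()
    return pts

# Six pixel brightness values: a dark cluster and a bright cluster
brightness = np.array([0.10, 0.12, 0.11, 0.80, 0.82, 0.79])
modes = mean_shift_1d(brightness, bandwidth=0.2)
# Pixels that converge to the same mode belong to the same segment
n_segments = len(set(np.round(modes, 3)))
```

Loosely speaking, the real tool does this in a joint color-plus-location space: spectral detail and spatial detail weight how finely color and position differences matter, and minimum segment size then merges any resulting cluster smaller than the pixel threshold into its neighbors.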
So moving on further, I hope your segmentation worked and did not create any frustrations for you. If it did, again, please go to the package that you unzipped for today's exercise; you will find the output called seg1520600 there, and you can load it into your workspace. I'm gonna reduce my geoprocessing pane a little bit and zoom back to my segmentation image. So you should now have your segmentation image at the top of your contents pane. And we see this as a grayscale image, which is a little bit counterintuitive at first, because really we're looking for regions here, right? What are the regions here? If we zoom in, we will notice that there are these clumps of different grayscale values, and they sort of look like little regions there, so it seems like it did something. What does the value mean? The value here ranges, in my case on my screen, from 23 to 252. For you it could maybe be from 1 to 254; it's somewhere within that 0 to 255 range. And that's simply what it is: a brightness value in the range between 0 and 255. This value is not really useful for us at all, unfortunately. I'm not sure why the software codes it that way; I think it has to do with 8-bit data being computationally easier to handle. But what we really hope to get here are little regions within the image that we can use to extract information, to smooth local noise. If we are interested in working with our GCC index, for instance, we would like to have that GCC value computed for each little region, right? And to do so, each little region has to be unique. In other words, we would like these objects, these segments, these little zones in the image, to all have unique ids whenever they represent a discrete feature.
Now what happens with 8-bit data is that because we can only take values from 0 to 255, in other words 256 possible values, this becomes a problem if segmentation produces more objects than that, and oftentimes that is the case even in a relatively small spatial extent. So the risk here, in other words, is that the values shown in the value field are not going to be unique, and if later, let's say, we want to calculate

mean GCC for each region, which we'll try in one of the next steps, we can't get a unique per-object mean GCC, because it's going to grab whatever other little regions and zones have that same grayscale code and lump them all together into one category, even if they're not spatially connected. We don't want that to happen here. So this is why we use a workaround. What we will do to circumvent this problem is take the segmentation image that the software produces by default and convert it into a polygon vector layer. The little polygons will have the same boundaries as the zones produced by segmentation, but they will end up having unique ids. And then we can summarize our GCC, or our original image bands, or anything else we like, using each unique id as a zone, and produce region-specific values of greenness or anything else we would like. So without further ado, let's see how this works. And to do so we need to use a conversion tool, which in this case is converting raster to polygon. Again, ArcGIS Pro makes this very easy by just using analysis tools, and we can search 'raster to polygon,' but it's in the conversion tools, so if you're more comfortable navigating the toolbox, you would find it there. So we're going to click on this tool, and here, as our input raster, we're going to use the segmentation outcome that was produced. And the field here says value. I just said that this value is not very meaningful to us, but actually it really doesn't matter for this conversion, because all we need to do is convert discrete instances of any value into their own discrete polygons. So it doesn't matter if we use this value. In fact, there is another option here called object id, but it didn't work for me when I tried it, so let's just leave value at the default.
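The key point is that disconnected patches sharing the same 8-bit value must end up as separate features with their own ids. A minimal sketch of that relabeling idea, using a simple 4-connected flood fill over a made-up segmentation raster (the real relabeling happens inside the Raster to Polygon conversion):

```python
import numpy as np

def label_regions(img):
    # Give every 4-connected patch of equal pixel values its own id,
    # so two disconnected patches with the same value get different ids.
    rows, cols = img.shape
    labels = np.full(img.shape, -1, dtype=int)
    next_id = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0, c0] >= 0:
                continue  # already part of a labeled patch
            v, stack = img[r0, c0], [(r0, c0)]
            labels[r0, c0] = next_id
            while stack:  # flood fill this patch
                r, c = stack.pop()
                for nr, nc in ((r-1, c), (r+1, c), (r, c-1), (r, c+1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr, nc] < 0 and img[nr, nc] == v):
                        labels[nr, nc] = next_id
                        stack.append((nr, nc))
            next_id += 1
    return labels

# Two patches both coded 252 but separated by other values
seg = np.array([
    [252, 252,   0, 252],
    [252,   0,   0, 252],
])
ids = label_regions(seg)
```

After relabeling, the two 252-coded patches carry different ids, so a later zonal statistic treats them as separate zones instead of lumping them together.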
For the output polygon features, we can save them while preserving the segmentation name. So I would also call it seg1520600, but I would add something to make the name unique. You can see the way I'm saving it here: I'm going to save it as a feature class outside of the geodatabase, so it's not going to be in the same folder, presumably, as my input raster. In the exercise screenshot that I have, I actually kept the name identical to the input raster, but I would suggest it's a good idea to add something like a v for vector, or something else, just to make the name unique and avoid conflict. Sometimes you might run into an error, especially if you try to save this vector layer inside your geodatabase; it will not tolerate that. So let's just add another letter here to make it unique. And here, this is important: I do not want to simplify polygons. You could do this if this were your final outcome and you wanted smoother boundaries for these objects and regions, just for mapping or visualization purposes. But in this case I actually want to preserve the irregularity, because I know in my original image some of these vines have very irregular, fuzzy-looking boundaries that are meaningful that way. So I don't want to simplify them; I want to preserve them so they align with the pixels from which they were produced. And another box here says 'create multi-part features.' I would not recommend that here either, because a multi-part feature is essentially something that consists of several discrete polygons but is treated as one region. In other words, from this original grayscale value, let's say you have a hundred different little regions that all got assigned a value of 252.
They would all be treated as one unit, and when you calculate statistics, they would all be used in that statistic calculation. But that's not what we're trying to do with this conversion; we're trying to disaggregate them, in a way, and make sure each little object, each little region, is its own zone. So that's why we do not want to create multi-part features. Maximum vertices per polygon feature: we don't need to specify that either, we can leave it blank, because it's really hard to tell how many vertices we want for these objects, and we would just like to preserve the image raster regions in vector polygon format. So at this point I can click run. It's running; it might take a little bit of time,

but hopefully not too long. And the result of this will be a vector layer. We are moving now to page 12 while this is running, just to orient ourselves. There are a lot of little regions. Even though we set the minimum size to 600 pixels, that's still not that much for an image with five centimeter spatial resolution, so it might take a little bit of time. It's done for me now. What you can see immediately is something that looks like a polygon shapefile, which is what it is. We can change the symbology of this polygon shapefile to help us assess where these little regions fall. Before we do that, let's just do a very quick check. On this polygon shapefile in your contents, right click and select attribute table. Because we've converted this from raster to vector format, we now immediately get a table of the different attributes that are initially available for each little region. And you can see that we now have this field called id, and if we scroll down, that id actually is unique. And this is why we were doing this. So now, rather than using the grid code, which was the grayscale value limited by the 8-bit nature of the data in the segmentation process, we have unique ids that don't have a limit and go higher and higher. And the maximum number, you can see, is 43,480; that's how many objects, or regions, or segments we have produced for this image, and each of them has a unique label. That will allow us to use that label to calculate their parameters, like mean GCC value or something else, in the next steps. I'm going to close the attribute table by clicking the little 'x' next to the table name. I now suggest that we change the symbology a little bit. If you'd like to see how these regions overlay against either GCC or your original image, it's good to make them transparent. So we can just click here to modify the symbol, and you can see the symbology pops up here.
Alternatively, we can right click, symbology, as we did before with raster layers. And for the single symbol, what I would suggest we do is just make them something like this black outline here, which is the first one that comes up in the menu, and which will make them hollow with black boundaries. They look identical to your segmentation output, which is not surprising, because that's what they were produced from. But I'm actually going to unclick the segmentation grayscale image that's second from the top and look at how they compare against my original image. And you can see they're fairly irregular in shape, but these little outlines do seem to cling pretty well, in a lot of places, to patches of bare soil, vegetative features, even shadows. Also, because shadows are fairly dark, we can change this outline color from black to something else. I can go back to the symbology tab on the right and go to properties, and in these properties, here in the top menu, you can change what's called the outline color. I'm gonna make this one dark blue, for instance; it's a matter of whatever preference you might have here. And click apply. So you can see now a little bit more clearly where these regions fall. You can see they're fairly irregular. If you wanted more regularity in the image for your purposes, one way to achieve this would be by adjusting the spectral detail parameter.
When we were specifying the segmentation inputs in the earlier step, if you had set the spectral detail lower, they would be a little bit less irregular. That doesn't necessarily matter for us here, because we have seen that some of these vegetative features are irregular in shape. And some of the trees, let's scroll down and see what happens here, you can see that in some cases certain trees got identified almost as a single object, which is pretty cool and very much aligns with our goals. But larger trees often get split into more than one region. When an object that we would ideally like to see as a whole gets split, we call it over-segmentation. But just to reiterate what I said earlier, that is a better problem to have for our purposes, because later we can extract characteristics of these regions, like their GCC value for instance, and use them to merge the regions into vegetative patches. So we can reconstruct larger features later by grouping

these regions. But what they initially allow us to do is address this noise. Look at this example here, for instance. This tree is pretty heterogeneous. In this little portion of the crown on the original image, and I can also click GCC, you see there's a variety of different colors, even though we use a stretched color bar: darker green, lighter green. But it actually ends up being delineated here as one region, and that does absorb some of this local variation into one unit and would allow us to differentiate it. You can notice here that shadows also got assigned into separate groups. That's a very neat technique for dealing with shadows. It's not our main topic today, but if you were interested in removing shadows, or in identifying them as a separate class in your mapping analysis, segmentation is a very convenient way to do so, because shadows are so different in color but can be quite irregular. And in some cases, even in the vineyards, they can be picked up, although not necessarily ideally with this set of parameters. Anyway. So we can also go back to our purple-looking trees and see that they also got over-segmented and separated somewhat from their shadows here. And one thing that I didn't mention earlier when we were looking at the pixel-based GCC, so I'm gonna highlight it here, is that these purple trees actually had a fairly low GCC value compared to the trees that look green. At the pixel level, which I have just clicked on underneath, they look more like roads, and they also look more like these non-vegetated surfaces. And that is because they're not really green. Even though they're healthy trees, they're purple, so this greenness index, at least at the pixel level, didn't pick them up very well. So we will now try to calculate our GCC at the level of these regions, at the level of these objects, so that every little unit has its own value of GCC.
And we'll start with the mean value, so we get a sense of what the average greenness is, based on this GCC index, for every little polygon. We will do so using the zonal tool in ArcGIS. We're moving now from page 13 to page 14, step 4.11. So again, it's very convenient to go to analysis tools and search for 'zonal.' What we want here is a tool called 'zonal statistics.' Again, it's available through both the image analyst and spatial analyst toolboxes; it's the same. So we're just going to click on it. What we're going to do now is use the unique id that we established for each object, after converting from raster to polygon, as the identifier, and we will select all the pixels from the GCC image that correspond to the object with that id and calculate their mean. So in this tool, zonal statistics, we will select the input raster as our… Actually, the input raster or feature zone data, that is our polygon layer. What we want to do is use our vector layer, with its unique ids, as the zones for which we want to calculate means for these little regions. Zone field: id, yes, that's exactly the field we want to use. It's the individual id of each little region. Input value raster, this is the input of whatever it is we're going to summarize for the different zones, and in this case it's going to be GCC. We're going to calculate mean greenness for each zone. And so the output raster, I would just call it something like GCC mean. It's again a good idea to have the name reflect the specific statistic we're trying to produce from this analysis. So I'm going to call it GCC mean. And then the statistic type. If we look at this pull-down menu, there are a few very standard stats like mean, maximum, minimum, range, standard deviation, sum.
So we're going to use mean now, and we're going to keep this box checked: ignore NoData in calculations. What this is basically doing is, if there are any holes in the image, and this data shouldn't have them, but sometimes you have them from previous masking or some kind of pre-processing, it'll just not consider them at all and only work with pixels that have values, just for convenience. And then we hit

run. It'll also take a little while, but what this will produce is a zonal image where we also have regions, similar to what we had in the original segmentation outcome, but now the values of the pixels correspond to the mean GCC of the little region to which they belong. This got loaded here right under my vector layer, this GCC mean; you can see it in my table of contents. I'm not going to close my geoprocessing just yet, just move it a little bit to the right here. I'm going to right click and say zoom to layer. There's a lot of detail when we zoom in, because our regions are small. And I actually would like to change the color palette of this GCC mean to match the original GCC. Now you can see the values range from 0 to 0.6. That's narrower than 0 to 1, and there is no particular reason why; it simply means that once we've averaged over individual regions, we don't get values of GCC as high as some of the individual pixels did before. But the possible range of values for the index is still the same as for the original pixel-based index. Now I'm going to unclick the polygons just for a second, and I'm going to right click on this GCC mean and choose symbology to put on the same color palette as I used for my pixel image, just for comparison. Actually, it might be a good idea to pull this pane back a little bit, stretch it a little more. So again, it's this one: going to the bottom and then counting up, it's about eight from the bottom, the red, yellow, green one. And I'm also gonna use histogram equalize, the same as I used for my GCC before. If you used a different one before, you could choose that one here as well. So this is what we see now. We can scroll in and out, and we can see that this index is not as heterogeneous inside trees and other places, because it now follows the structure of the little regions. We also can see that a lot of features that are supposed to be green also stand out with higher values of GCC for their regions.
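What zonal statistics is doing here can be sketched in a few lines of numpy: group the pixel-level GCC values by region id, take each group's mean, and broadcast that mean back to every pixel of the region. The values and ids below are made up for illustration:

```python
import numpy as np

# Hypothetical pixel-level GCC values and the region id of each pixel
gcc  = np.array([0.55, 0.60, 0.30, 0.35, 0.20, 0.22])
zone = np.array([0,    0,    1,    1,    2,    2])

# Per-zone mean: sum of values per id over pixel count per id,
# then indexed back so every pixel carries its region's mean
sums   = np.bincount(zone, weights=gcc)
counts = np.bincount(zone)
zonal_mean = (sums / counts)[zone]
```

This is why unique region ids mattered so much: `bincount` pools everything sharing an id, so a duplicated 8-bit code would silently average disconnected patches together.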
But these notorious purple trees are still not green. They are purple, and with this particular green chromatic coordinate index, probably due to their low reflectance in the green band, they don't stand out the way the other trees do, the ones that are actually green. So we're facing an issue like this, right? We can have vegetation that comes in very different colors and shades: in human landscapes with ornamental plants, but also sometimes in wild landscapes like deserts or chaparral or shrubland, where not everything that counts as vegetation is necessarily green. So one of the ways to get around this is to look beyond the absolute magnitude of greenness. And this is especially convenient when we work with segmentation results, with objects. So I want to close my symbology here and go back to geoprocessing under it. Now we can calculate another statistic for our regions, and this one is going to be standard deviation. Standard deviation, if you're not familiar with it, is simply a measure of how values are dispersed around the mean. It's essentially a measure of heterogeneity or variability. It's not going to tell us how green the object is, but rather how variable the object or region is in terms of its greenness. So what I'm going to do now is keep my input zone data the same, because we're still going to do this for each little region and its id. I'm still going to keep my input value raster as GCC. But now I'm going to calculate the standard deviation of GCC. And I will adjust my output file name accordingly, so I'll just put std, for example, here. And I would still ignore NoData in calculations.
So now I'm gonna run this; just to note, the error message went away because I changed the name. This will produce a measure of standard deviation for every little object that I have, and we'll see how that might or might not help us with the vegetation delineation in just a second, as it finishes running. When this layer loads, we actually don't have to use the same color palette anymore, because we're not looking at high and low greenness. We're looking at high and low variability, so we could either keep it as grayscale or, maybe to make it a little more convenient, I could still go to symbology and change the symbol to

some sort of binary scheme if I wanted to see what is more heterogeneous versus less. So I'm just going to use a slightly different palette, maybe red to blue, just to not get confused, where blue would be higher heterogeneity and red would be lower heterogeneity or standard deviation. And I'm going to choose histogram equalize here as well. So if we look at this image now, you'll notice something pretty interesting: areas that are more heterogeneous definitely stand out, because they now have a higher standard deviation of those little pixel values of greenness inside their regions. And if I look at the trees, and I'm going to remove the mean from the view and just compare this GCC standard deviation in my contents against the original image, now these purple trees do look like trees. And they look very similar to the green ones, because they're similarly heterogeneous. They have similar patterns of light and dark in their canopies, and while the absolute value of greenness didn't capture them, because they're purple and not as green, the variability is similar. And that means that we can use something like the standard deviation to help us further identify vegetation, or refine our classification, if we stumble upon such an issue.
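The per-region standard deviation can be sketched the same way as the zonal mean, using the identity variance = mean of squares minus square of the mean. This sketch computes the population standard deviation; whether ArcGIS uses the population or sample form is a detail worth checking in its tool documentation. The values and ids are again made up:

```python
import numpy as np

# Hypothetical pixel-level GCC values and region ids: region 0 is a
# heterogeneous tree crown, region 1 is a smooth bare-soil patch
gcc  = np.array([0.30, 0.60, 0.45, 0.33, 0.34, 0.33])
zone = np.array([0,    0,    0,    1,    1,    1])

counts  = np.bincount(zone)
means   = np.bincount(zone, weights=gcc) / counts
mean_sq = np.bincount(zone, weights=gcc**2) / counts
# variance = E[x^2] - E[x]^2, broadcast back to every pixel
zonal_std = np.sqrt(mean_sq - means**2)[zone]
```

The speckled tree crown gets a high standard deviation regardless of whether its mean color is green or purple, which is exactly why this statistic helps recover the purple trees.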
You can also notice that the vines are a little bit like trees: they also have some local heterogeneity, they're woody, they're a little irregular in their canopies, so they also tend to have somewhat high heterogeneity compared to the bare soil among them, and that becomes another way of helping to distinguish them if for any reason color or mean GCC value was not enough. So this is all fine, but how do we go from such an index to a very basic map? What we're going to do now is add one more step, and to keep this straightforward, we will only do it for GCC mean, recognizing that it does have the limitation of not picking up purple trees, but otherwise it seems reasonable for vegetation. And we will do a very simple step where we convert this GCC mean image into a binary image. Basically, you can think of it as a mask, or a simple vegetation cover map, that, rather than dealing with this continuum of GCC values, simply tells us whether this is vegetation or not based on some criterion. And the simplest way to do it is using a threshold. Now, to determine a threshold, we have to look a little bit into how these values are distributed, because a threshold would basically be a way of saying 'if some values of GCC mean are greater than this, then we're going to count them as vegetation; if not, then we're not going to count them as vegetation.' And oftentimes it's very difficult to find a single threshold that satisfies the whole image. It's a little bit easier in this type of environment, where plants have good contrast with other surfaces. It's also a little bit easier when the image is smaller. But just to show you how this works, we're going to right click on our GCC mean now. And you can see that if you have the symbology open, it automatically switches to that.
If not, we can open its symbology from the right click in the contents and choose symbology. And in here, rather than using this primary symbology, rather than using stretch, which was simply a way of visualizing things, we will now say ‘classify.’ So we go to this little classify menu. This is still a way of visualizing, but it will help us find a threshold through a fairly simple visualization exercise that we can then use to formally convert this image into a vegetation cover map. And here we only want two classes, vegetated and non-vegetated, so I’m gonna change the number of classes from five to two. And the method called natural breaks is a very convenient method that looks for natural changes in the value range and distribution. In this kind of image, we would expect that a lot of pixels will look like vegetation and a lot of pixels will not, so there should be a fairly good natural contrast. And when we do this, you can see that down below we only have two classes, yellow and red, and they already have a threshold that the software identified as a potential separation of these two classes. To see this more easily, I will click on this yellow polygon and actually make that no color for the low values. And what that will do, if I have my original image under this GCC mean image in my map and everything in between unclicked (if you don’t have it that way, you can take a second and unclick), is show you where this red, which corresponds to values higher than a certain threshold, falls in this image. And this is a little bit confusing the way it’s set up in the Pro menu, but what this means is that all the values greater than this 0.34582 are actually going to be coded red. And you can see right away whether this is enough or not. It seems like it’s missing a little bit of vegetation in the vineyard, so maybe we could use a slightly lower value like 0.33. Another thing we can do is click on the original GCC image here, or on the GCC mean, and simply click using the pop-up window to get a sense of what the values were in vegetated areas and not. So I’m going to zoom a little closer. If I click, for example, in this vine here that was missed, it’s also 0.34, but a little bit less than 0.345. This particular index has a really interesting, nuanced range of values based on the ratio of green to green, red, and blue. So what I’m going to do now is take a slightly lower threshold to allow some of these missed clumps to be included, and use a value like 0.34. I could change it right here, or I could switch this to manual interval as well, if you want to try that; that way this becomes editable and I can set this value to just 0.34. So what I did here is switch the method to manual interval and then edit this upper value in the top box to 0.34, with no further digits after that second decimal, so now a little bit more is included. And I can zoom out. 
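For reference, the GCC index the session keeps referring to is the Green Chromatic Coordinate, the ratio of green to the sum of red, green, and blue. A minimal NumPy sketch (the function name is my own, not an ArcGIS tool):

```python
import numpy as np

def gcc(red, green, blue):
    """Green Chromatic Coordinate: G / (R + G + B).

    Bounded between 0 and 1; greener pixels score higher.
    Pixels where all three bands are zero become NaN.
    """
    red, green, blue = (b.astype(float) for b in (red, green, blue))
    total = red + green + blue
    # double np.where avoids a divide-by-zero warning on empty pixels
    return np.where(total > 0, green / np.where(total > 0, total, 1), np.nan)
```

A gray pixel with equal bands scores 1/3, which is why vegetation thresholds for this index end up just above that value, around 0.34, in this image.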
So we’re going to use this as a mask for now to try and develop a vegetation cover map. And to do so we can use a tool called Reclassify. So if we go back to analysis tools, here in geoprocessing I’m going to type ‘reclassify.’ It’s available from two different toolboxes, 3D Analyst and Spatial Analyst; it’s the same tool in both. What we’re going to do now is convert this GCC mean image, which had values in the range from 0 to 0.6, into a binary image that only has value 1 if it’s vegetation and 0 if it’s a non-vegetated area, based on this threshold of 0.34 that we have identified. And in your own exercises or applications, you can always play a little bit more with these threshold values and find something that suits a particular study area best. Different indices, by the way, might have different threshold values too. For GCC we found this value for this image, but for some other type of index or some other image, it could really be different. So what we’re classifying here is our GCC mean image, so we’re going to use that as an input. The reclass field is ‘value,’ which makes sense because the value here is the value of GCC that we’re going to use as a basis for this thresholding. For the output raster we can use some sort of reclassification name; I’m just going to call it ‘GCC green features,’ or you could call it ‘GCC veg.’ So we’ve modified that. And then, to actually enforce this threshold, we’re going to click in this reclassification table. If you double click, it will start allowing you to edit these fields. So what we’re going to say is that everything that starts at zero and ends at 0.34 gets a new value of zero. We want everything low to be zero, non-vegetated. 
Okay. Then I click on this ‘new’ here under the zero and hit enter, and that gives me another row in this little table. Now I’m gonna say that everything from 0.34, and I’m just gonna add a few zeros and a 1 here, like a super tiny next value, up to the maximum. I’m gonna use the maximum of 1 from the original GCC raster, but you could also use 0.6 from the raster we produced after getting GCC mean for the polygons, whichever is the maximum value. This index is only bounded by zero and one, so it makes sense to use either one of them here. And then the new value would be 1. So that would be a different code for vegetation. Change missing values to no data, we could certainly do that. This is optional; we don’t have a lot of missing values or no data here, so we don’t have to, but if you wanted to you could keep this checked. And you can also see that whatever was no data in the image will be preserved as no data by default, and we would like to keep that as well. Usually this matters when the image is not a regular rectangle but has no data values outside of its boundary; if it has an irregular overall spatial extent, it’s a good idea to keep that so it stays as no data. So now we’re ready to hit run. I’m going to hit run, and what this will produce is a binary raster where everything that has a value of 0 corresponds to what we’ve decided to treat as non-vegetated features. I’m gonna slide this over to see a little bit better. And then vegetation has a value of 1. We can save the symbology, we can change the symbology if we wanted to modify it, we can export it, we can preserve it within the file; we can do whatever we would like with this. I’m not gonna do the export just yet; we will do it for the very last step of today’s workshop, where we’ll have something similar from a different workflow, but I’m going to save my work now again. And zoom to layer. So this is a very simple way of converting these indices, produced either from raster manipulation of individual pixels or from segmentation with extraction of the mean. We could do the same for standard deviation if we wanted to, and you now have these tools to explore them further in your own work if you would like. But now, in the last step of today’s workshop, I would like to show you a slightly more complex workflow that also leads to a similar result, but provides a little bit more flexibility and nuance in between. 
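The Reclassify step just described is equivalent to a simple threshold operation. A minimal NumPy sketch, assuming the 0.34 threshold found above (the function name is hypothetical):

```python
import numpy as np

def threshold_to_binary(gcc_mean, threshold=0.34):
    """Reclassify a GCC-mean image into a binary vegetation raster.

    Values at or above the threshold become 1 (vegetation),
    values below become 0 (non-vegetated), and NaN pixels are
    kept as NaN to mimic preserving no data.
    """
    out = np.where(gcc_mean >= threshold, 1.0, 0.0)
    out[np.isnan(gcc_mean)] = np.nan  # preserve no-data pixels
    return out
```

Swapping in a different threshold is just a matter of changing the keyword argument, which mirrors the advice above about tuning the value per image and per index.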
And ArcGIS Pro, as I said, is pretty much the only currently popular software that has this routine built in this way. It basically allows us to combine the process of segmentation with an automated search for regions and groupings in the data, to produce classes on the landscape and try to identify which of those classes correspond to vegetation. To try this workflow, what I suggest we do is unclick and remove a lot of the intermediates we have accumulated here in the contents. Really, all we need for this analysis is the input image. So feel free to unclick them all, or you can fold them and hold the shift button on your keyboard to select multiple of them, then right click and say ‘remove.’ So if you’d like to do this, starting from the second from the top, we can remove a lot of them, preserving just our image. I’m also going to close this geoprocessing window for now. This tool is not in the toolbox; it’s a separate, interesting workflow that you can find on the imagery tab here. It’s called the Classification Wizard, and right now you can see it’s grayed out. Just to orient ourselves, we are now on page 17, section 5.0 in the exercise. The reason it’s grayed out is that in the contents we have our map highlighted, but not the actual image. If we highlight the image, you can see the Classification Wizard becomes activated, and we can click on it. So this wizard is basically a built-in workflow that allows you to choose how you want to run your classification and what you want to use as mapping units: individual pixels or image regions. So we’re gonna click on it here. I’m gonna close the symbology pane to get it out of the view. 
You can see it’s multi-step; these little balloon circles are the different steps of the process. The beauty of this tool is that it allows you to move back and forth between these steps, manipulate them on the fly, and then only keep the final result you are most satisfied with. We’re gonna do one run-through of this tool, but we will discuss each step so that you know how you could apply it in other situations. The first step here is the classification method choice, and the options, if you look here, are supervised or unsupervised. Those are two very broad strategies in image analysis. Supervised requires what we call training data, or some prior information about the different classes in your landscape and their locations, so the algorithm can grab samples, learn from them, and then try to recognize other unmapped, unlabeled features from those examples. It’s a more complex process that involves a whole logistics of selecting training data, which we’re not including in today’s exercise. But oftentimes, in an analysis like this where we don’t yet know our landscape, maybe we haven’t visited it yet, and we want to explore the possibilities of detecting vegetation without the extra burden and cost of bringing in those training samples, the unsupervised technique can actually work quite well. It requires a little bit more interaction with the software and a little bit more visual guidance from you about the landscape, but it’s a very convenient exploratory technique. And you’ll see it’s actually pretty well suited for the routine we’re going to do today for vegetation delineation, in part because we can recognize vegetation so well in the landscape thanks to the high resolution and great detail of drone images. The next step here is the classification type: object based or pixel based. Because we have this high resolution data with a lot of heterogeneity inside the features we’re trying to map and, as we’ve seen before, it makes a little more sense to work with image regions as mapping units that absorb some of that local noise, we’re going to choose object based and use our now familiar segmentation technique as part of this workflow, to delineate mapping units on the fly through segmentation and then classify them into groups. And this is exactly what the unsupervised technique is doing. 
Because we don’t provide prior examples of classes, it’s actually going to try and find groups in the image, groups of objects in this case with object-based classification, that look more similar to each other than to everything else in the landscape. And then it’s our job as the expert to go through these groups and decide which of them are likely to be vegetation or not, and then create a final cover map from that. In getting ready for that, we also have to specify a classification schema. This is simply a set of predefined classes or categories. We can either create our own, which people sometimes do when they do supervised mapping, or use the default. We can just choose the default here, which comes from a standard national land cover classification; because we’re only going to do two very basic classes, it really doesn’t matter for this particular exercise. Output location: double checking it’s inside our project, that’s fine for now; it’s just in our folder. We can leave these optional parameters blank, but what they are for is this: if you had already developed some segmentation before and you really liked it, you could incorporate it here instead of running segmentation within this wizard workflow. We’re not going to do that; we’re just going to reproduce the segmentation here on the fly. And the reference dataset is for when you’re doing a supervised classification and need to provide training samples. Again, this doesn’t apply to us here. So we’re ready to click next. Here is our now familiar segmentation workflow, with only three parameters: spectral detail, spatial detail, and minimum segment size. I suggest that, for the sake of today’s exercise, we use the same ones as before. But I want to tell you that this is a great wizard workflow where you can actually play with different ones and fine-tune them here. 
Before you produce your final map, your final outcome, you can go back and forth with ‘previous’ and ‘next’ and see whether these regions work: maybe they’re too small right away and way over-segmented, or maybe they’re too large and it’s just not going to work because they’re already lumping vegetation with something else. So this is a good place to explore that. And if you wanted to only think about boundaries, you could check ‘show segment boundaries only.’ You’ll see what this does in the image; we’re going to zoom in a little bit. If we go to next here, after a while it will show us these region boundaries, by default in black. You can see there’s nothing in the contents; it’s not producing a segmentation layer, but it shows us what the software would see in this workflow as region boundaries. And you can see it’s actually quite neat here and corresponds to vegetation clumps pretty well. If we didn’t like this, we would go to ‘previous’ and play with those parameters a little more, but I’m going to just keep it as is for now. This next menu, where we are now, is where we specify the parameters of the unsupervised classification, the grouping of our regions into classes. And ‘classes’ here is a very loose word, because these are not your final classes, not the vegetated versus not; they’re prototype groups that we will later label and decide whether they are vegetation or not, before reclassifying them into a final map. Technically they’re called clusters. So this ISO Cluster is one of the methods; it’s a very popular iterative algorithm that goes through our data and finds which regions look more like each other based on their values for the image bands, since we’ve specified our image as the input dataset. Maximum number of classes: again, this is not the final number of classes, which in our exercise is two, vegetated and non-vegetated; these are candidate groups for vegetation and not vegetation. And it’s always a good idea to have slightly more of these initial maximum classes than the actual classes you need, because if some of your target cover types are very heterogeneous, you might actually have several different groups that you will label as that class and later merge, rather than having the software get stuck trying to find a single group for something extremely heterogeneous. Because we’re only mapping two classes for this exercise, I’ll leave this at the default of five, but generally it’s a good idea to leave a little more room. If you were mapping, let’s say, three or four or five classes, you might use something between 12 and 20. The more you have, the more you have to label later, so that becomes a little tedious, and there is a time cost to it. 
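Conceptually, the ISO Cluster grouping described here behaves much like k-means clustering on per-region attribute vectors (mean band values plus, optionally, standard deviation). The toy NumPy sketch below is my own simplified stand-in, not ArcGIS Pro's actual implementation:

```python
import numpy as np

def kmeans(features, k=5, iters=20, seed=0):
    """Tiny k-means sketch, a hypothetical stand-in for ISO Cluster.

    features : (n_regions, n_attributes) float array, e.g. per-region
               mean R/G/B plus standard deviation.
    Returns an integer cluster label per region.
    """
    rng = np.random.default_rng(seed)
    # initialize cluster centers at k randomly chosen regions
    centers = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        # assign each region to its nearest cluster center
        d = np.linalg.norm(features[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        # recompute each center as the mean of its assigned regions
        for j in range(k):
            if (labels == j).any():
                centers[j] = features[labels == j].mean(axis=0)
    return labels
```

The `iters` parameter plays the role of the "maximum number of iterations" discussed next: too few and the grouping may not have settled, too many and you just pay extra runtime after convergence.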
The maximum number of iterations is how many times the software will go through the groups to reevaluate how tight they are, how similar the members are to each other and how different from the rest of the landscape. If you choose this number to be very small, it may not arrive at the best grouping; but if you choose it to be very large, it’ll take a while to run. So I think for the purposes of this kind of exercise, somewhere between 10 and 20 should be okay. I’m actually gonna leave it at the default of 20 here. Some of these other parameters are more about fine-tuning of the groups. I would leave them at the defaults here, because we’re basically just looking for groups that correspond to two classes with fairly good contrast, and they both cover a fairly large amount of the landscape, so we can just leave these at default. The interesting menu here is the segment attributes menu, which asks: when we look for groupings among the regions, what do we want to take into account among the region properties? We can use color, which is the first one, mean digital number, the mean value of each individual layer, red, green, or blue, for each region. But we can also use standard deviation, because we saw earlier that some features, like trees, are actually more heterogeneous than bare soil or rows. 
And that can also help differentiate them even when something else, like color, is not a very good determinant, as with those purple trees. This is not going to be for GCC anymore, because we’re not using GCC as an input; this is going to be the values of the original bands for these little regions. And so, with all of this, we are ready to hit run. I should again warn you that this particular step can take a while; for me, in my test runs, it took even longer than the segmentation did. So if that happens to you, you can load the output of this unsupervised classification from the files that we provided, for our final discussion and the last steps of today’s workshop. So if it takes a really long time, please know that you have this backup option available to you. I’m going to give it a couple of minutes; it looks like it’s on its way. And while it’s running, I’m happy to take a few more questions, so I’m going to open my chat window right now and see if any questions come up while the Classification Wizard is running. Chippie Kislik: Iryna, it looked like one of the questions earlier was how we can mask out water. Iryna Dronova: Ah, water. So that’s a really good question. In this particular case, water is very green, and it’s a little bit difficult to mask it out based on thresholds of GCC, for example, because it looks so much like other cover types. So there are a couple of strategies for this. One, I would say, is that you can threshold it separately using its own indicator. For example, you could look at the original red band. Let me see if I can do it while the classification is running. Yes, it allows me. So I’m going to go to my data and just add the red band, which is the first band alone. Sometimes water in the red band can be very dark. It’s not the case here because of the chemistry of the water, but if it is the case in your image, if you load only the red band and water is distinctively darker, you could develop a threshold using this classify methodology, like we did to find a threshold for GCC mean, and separate it out first, allocate it to a different class, or create a mask by reclassifying water. Another technique could be using a water-specific index, like GCC but one that’s sensitive to water. I mentioned earlier, when answering questions, that one quick one that comes to mind is the difference between red and green: red minus green, divided by red plus green, would be a very good way to do it. If you have access to an infrared band, like a near-infrared band, on your instrument or data, that band is very sensitive to the presence of water; you can use either the near-infrared band itself or some indices with it to threshold and then eliminate water as a mask. Sometimes you can also, I wouldn’t say cheat, but make it easier for yourself: if your water is only one object in the image, like here, you could segment it out and manually set that polygon, basically manually edit its attribute value to force it to be water. If you’re only mapping a fairly small study area and you have very few, very recognizable water features, like this private pond in this example, you could simply assign it to a class separately. That is also an option. 
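The quick water indicator mentioned here, red minus green over red plus green, is straightforward to sketch (the function name is my own):

```python
import numpy as np

def red_green_index(red, green):
    """Normalized difference of the red and green bands: (R - G) / (R + G).

    Surfaces greener than they are red score negative; reddish
    surfaces score positive. Pixels where both bands are zero become NaN.
    """
    red, green = red.astype(float), green.astype(float)
    denom = red + green
    # double np.where avoids a divide-by-zero warning on empty pixels
    return np.where(denom > 0, (red - green) / np.where(denom > 0, denom, 1), np.nan)
```

As with GCC, you would then look for a threshold on this index, using the same classify-and-inspect approach demonstrated earlier, to split water from everything else.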
It’s not ideal when you look at large landscapes or want to automate something; we always want to rely on quantitative methods as much as possible for automation, but sometimes, depending on the purpose of the map, that could be a quick workaround. Water can also be very smooth, so in this case you could potentially get at it through a few different characteristics: it’s maybe green but super smooth, much smoother than other green features like trees. Using both standard deviation and greenness, you could isolate it by querying, for example in the attribute table, saying ‘find me an object that is very smooth but also green,’ to help narrow it down. There’s more than one strategy for this, but water mapping is a whole other interesting area, which you could tackle using very similar principles as vegetation mapping, except with more water-suited indices. Okay, this is a great question: how challenging would it be to use RGB for classifying tree plant species? Right, so it is a little limiting, in the sense that a lot of trees are green, a lot of vegetation types would be similarly green, and, with respect to trees, sometimes they’re similarly heterogeneous. So if you only have RGB data available, I would highly recommend specifically using the segmentation-based analysis or this Classification Wizard workflow with segmentation, because then you can add not only color and standard deviation, but also shape. You can add some other parameters that could provide additional fingerprints or attributes for this classification. For example, you could say that maybe individual trees are roundish, more compact in shape, and you could compute their geometry parameters even manually through field geometry in ArcGIS Desktop or Pro. Or you could also, if you notice in this Classification Wizard menu, use some other features like rectangularity, 
compactness; those deal more with shape and blockiness. And so something like a vineyard, or shrubs, that form these very elongated strips, would be different in these parameters from something like trees, and that could provide additional attributes even if you are fairly limited in color. So there are definitely possibilities to do that, but it would be a more multi-step process. Similar for vigor. So, can you compare the vigor of adjacent grape rows? A lot of these indices, like GCC, are actually quite good as a proxy for vigor, because if you lose vigor, you lose greenness, and you will start seeing nuances. It’s almost as if, instead of classifying different tree species, you would now be classifying different vigor classes. You could approach it as a classification problem: you could basically stratify, break down this GCC palette into different vigor classes. Or you could relate it numerically: if you have measures of vigor on the ground, you can develop a simple equation or mathematical model relating sample locations of grapes, corresponding to different vigor levels or classes, to their GCC values from the image, if you have them GPS’d and know their exact locations. That would allow you to develop a little model potentially predicting vigor from this greenness. Indices like GCC are quite well suited for that even if you don’t have near-infrared data. For shadows, same thing. You can treat shadows as a different class, and you can then isolate them either by thresholding, since they often would be darker than everything else and sometimes potentially smoother, or identify them as another grouping in the data through this workflow. And now you can see that my classification has stopped running and brings me to this last step. I’m gonna keep talking about shadows and other things, since I haven’t quite finished answering the question yet, but it applies very well to what we’re going to do next here in the image. This particular unsupervised workflow is neat because it’s unsupervised; we don’t know in advance what groupings the software will find. But if we allocate some number of classes, in this case we used only five, but we could say 10 or 12, there’s a high chance that some of these spectrally distinct things, that maybe also have a different standard deviation or, if we choose compactness and rectangularity, a different shape, will end up in the same group. And then you can find them through this process and isolate them into an individual class. 
And now you can see that the software has worked through this analysis and arrived at some kind of class output here. Your screen might look a little bit different from mine. One thing we can do is decide whether we want to keep playing with this classification or not. Because we don’t have time to keep playing with it, I’m just going to click next and save this as a dataset before I look at it in the contents. But you could keep playing with it later if you wanted to explore it a little more. I’m just going to call this ‘classification1’; I believe in your exercise we have a slightly different name, ‘unsupervised vineyard.’ Yes, in your exercise I’m also including the segmentation parameters to keep track of how this classification was produced. We can leave these as blank or default, but sometimes we can save additional parameters from this process. If I click run, it’s just going to save it and load it here in the contents using the schema, the predefined set of colors from the palette it’s using, from that NLCD default that we chose. But we can change the symbology to whatever we like in the next step. So just a second here while it loads; I’m gonna keep talking about shadows. What this does is produce a raster dataset where the different color categories correspond to those five classes we set earlier; that was our maximum number of classes or groupings that the algorithm would search for. These are not our final classes yet; they’re basically suggestions, candidate groups. And while it’s loading the classified dataset, you can see this temporary preview has five categories, ranging from zero to four. Okay, I can unclick that preview now and just focus on the top one, now that it’s loaded. It might look very similar on your screen to mine. Now, one thing I should say: these groupings are nominal. 
In other words, when you run this, you will also end up with five of them, but where they fall in the landscape with those colors might be different from what they look like on my computer. You would produce a very similar map, but when we look at these classes, when I say, in my case, that class zero, which is this paler cyan-ish green, is obviously the non-vegetated one, like the roads and everything else (I’m gonna unclick the intermediates for you to see), it may actually not be zero in your case; it could be number four or number three. So please keep that in mind for this exercise, because every computer is a little different and these labels are a little bit arbitrary. The actual groups will be very similar, but how they’ve been labeled might differ from computer to computer. So I am now going to right click here and go to symbology. Chippie Kislik: And maybe we can show them how to load in the data, if the classification didn’t work. Iryna Dronova: Yes, that’s a good idea, Chippie, thank you. So if your classification didn’t work, you would go to the map tab, click on add data, and navigate to our original folder from the data exercise. And in the final products here you would find this product called ‘unsupervised vineyard segmentation tiff.’ This has been already saved for you, and I’ll just load that one. Okay, so you can see that even between this and the one I had on my computer, they look very, very similar, but which label ends up on which category is a little bit different between the two. So if you run into that problem, please just note what the classes might be on your screen; which categories correspond to what will vary from computer to computer. But we can change that. So why don’t I work with the one that we have loaded; maybe that would be easier, because that way you can reproduce this from that same file if you wanted to try it later. I’m highlighting this file that I just loaded with the classification result, right clicking, and selecting symbology, so it’s already done here. In this symbology we can manually keep track of what we would like to do. We don’t even have to change it now, because we’re going to reclassify this into a binary map again, but for convenience we could keep track of what is what and what is likely to be vegetation versus not. So for example, this class zero that is dark blue: if I unclick it and see what it is, it does look like part of this vegetated patch here. So that could be called vegetation; I’m going to right click on its label here and call it ‘veg,’ just for my own information in this label field. So class 0, in this case, in this output, is veg. Class one is these bright green vineyards, water unfortunately, but also a lot of trees. 
So I’m gonna call that veg for convenience as well; it definitely looks like this class also corresponds to green pieces. The red color here looks like some of these trees, including the purple trees; they end up in what is red in my image. Sorry, my software is freezing up a little bit, but I can see that they correspond to trees, so I’m going to, something’s freezing with ArcGIS Pro, I’m going to call that veg as well. And then of the other two classes, the purple also looks like it’s actually parts of the trees, so I might call that one veg as well in this output. But class number four is clearly a lot of that non-vegetated background, so I can call it non-veg. And again, in your case it could be zero or something different. So this is just for ourselves, but to convert this to a binary map like we did in the earlier steps, we have to do one more thing: reclassify it. So we would go to our familiar analysis tools; Reclassify is still here from the previous analysis that we ran. I’m going to search for this tool again and bring it back up. But now I have these classes to reclassify. So my input raster is this classification output; I’m just gonna use the one we loaded most recently. The reclass field is still ‘value.’ For my output, I’m just going to call it ‘unsupervised veg map,’ and it had the tiff extension by default, so I’m going to keep the tiff extension as well. And here I’m going to click inside this start-end table but, and this is important, instead of ‘value’ I will choose ‘class name.’ And that will change the layout: now I don’t have start and end anymore, I have old value versus new value. So in this example I’m going to do it slightly differently from the previous time. 
And I’m actually going to set everything that’s not vegetation to NoData. That, if you remember, was label number four in my case. So I’m going to specify that my class that was initially produced with label zero will be equal to 1 in the new value, and then I hit Enter. My class that used to be one in the original values will also be one, for veg, and I hit Enter again. My class that used to be 2, in my case, I will also make 1 for vegetation, and I’m going to hit Enter again. My class that used to be 3 is also going to be 1, so we have four classes that look like vegetation. And again, in your case it could be three classes and not four, because these groupings can be a little bit different from computer to computer. Then the final one would be class number four, the non-vegetated one. I could make it equal to zero, or I could make it equal to NoData if I wanted to completely eliminate it; if I only wanted to create a mask of just the vegetated areas, I would set it to NoData. It’s different from the previous step, but this is another version of how you could approach this vegetation mapping and masking. And now that this is done, I am going to check ‘Change missing values to NoData.’ Sorry, my Reclassify failed for some reason here. Okay, something didn’t quite run here. So what if I try setting this one to zero, just to be safe? Okay, no. So I’m sorry, everyone. Chippie Kislik: It worked when I ran it through my geodatabase. I don’t know if that’ll change your output. Iryna Dronova: If you save it outside of your geodatabase? Let me try and save this outside of my geodatabase. I’ll just call it VegMap2. If it worked for you when you did it inside... yeah, so now I’m saving it outside of my geodatabase and it works. Maybe that was the issue. And actually, yeah, thank you Chippie. So if you run into the same hiccup as I just did: when you set your output raster, right click and manually make sure that it’s outside of the project geodatabase, so it’s saved as a separate tiff file that you can later use for your mapping. Let me see if I can do this as I wanted to, with NoData here instead of zero; I would just call it a vegetation map. I’ll do the same thing and save it outside of my geodatabase, but now I’ll call it VegetationMap0, for example.
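The Reclassify step described above is just an old-value to new-value lookup table applied pixel by pixel. Below is a sketch of that logic in plain Python (not the actual ArcGIS Spatial Analyst tool); `None` stands in for NoData, and the cluster numbers follow Iryna's run, so they may differ on your computer.

```python
# Reclassify sketch: clusters 0-3 become 1 (vegetation),
# cluster 4 becomes NoData (represented here as None).
NODATA = None

remap = {0: 1, 1: 1, 2: 1, 3: 1, 4: NODATA}

def reclassify(pixels, remap, nodata=NODATA):
    """Apply the old-value -> new-value table to a sequence of pixel values.

    Any value missing from the table also becomes NoData, mirroring the
    'Change missing values to NoData' option in the tool.
    """
    return [remap.get(p, nodata) for p in pixels]

row = [0, 4, 2, 3, 4, 1]
print(reclassify(row, remap))  # [1, None, 1, 1, None, 1]
```

Setting class 4 to 0 instead of NoData would give the zero/one binary map from the earlier step; mapping it to NoData, as here, produces a vegetation-only mask.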
And it should still run again; I’m just making sure. Yeah, so it also runs, but now it has only one class, as you can see, and you can later change its symbology and label. So if we compare this to our original image, you can see that it actually captures a lot of the vegetated areas quite well, except for lawns. So this was an interesting result. It seems like this unsupervised classification was very sensitive to woody vegetation, but some of these herbaceous lawns may be different in color, they may be different in smoothness, so they didn’t get captured by this. And so one way to address this in future work could be to increase the number of classes, to allow a few more of them; some of them might pick up this unique vegetation type a little bit differently. We could also try different attributes in addition to the ones we tried, or remove standard deviation, as another possibility. So this is a workflow that you could adapt and make more complicated as you need, in a variety of applications and contexts. So with this (and I apologize about the little hiccup with the last tool) I would like to say thank you for trying this exercise and for being here at the workshop. I am happy to take any questions. I’m going to save my project; again, you want to save your projects. I have some ideas on how you could extend this further, but I would rather hear your questions and see if there’s anything immediate that you’d like to discuss. So I’m going to hang out here for approximately 15 to 20 minutes longer, and if you would like to contact me, I’m retyping my email in the chat right now: it’s idronova@berkeley.edu. I’m going to send it to everyone in the meeting, but it’s also on my slide, so please feel free to reach out. Or if you’re using these tools and you would like to later share how you have adapted this process, I would like to hear from you as well. Thank you all so much.