Image Processing in Python with scikit-image

My name is Stéfan van der Walt. I'm a postdoc at UC Berkeley, and I come from South Africa, where I'm an applied mathematician — or pretend to be, at least. Today I'd like to talk to you about scikit-image, the image processing toolbox for Python.

scikit-image is slightly different from many of the other projects you've seen so far. Projects like NumPy and pandas are widely applicable across all sorts of domains; scikit-image is the only project here that's focused on one specific domain — in this case, image processing. But the domain of application for image processing is vast. Image processing is used in science: you see it in microscopy, astronomy, biology; it pops up in computer vision and in engineering, to name a few. In industry we see image processing in interesting places as well: robots are controlled to perform automated inspection or control of production, and it's used to process satellite data. At the Strata conference there was a talk on augmenting sports television — for example, when a baseball player pitches, they want to know where the ball landed, what the angle of the throw was, the speed, and so on, and a lot of that is done with cameras following objects around, with a lot of image processing happening behind the scenes. And then there are some toy applications, like when you scan those little matrix barcodes — the black-and-white patterns you see on conference posters, advertisements, and so on — there's some image processing happening there too.

For me personally, what makes this project so appealing is that we really push the limits of what the current Python tools can provide. We rely on base packages such as NumPy, but we actually require more than what those tools can currently provide. We love programming in Python, we love what the current tools can do, but we need more — so we get quite excited by all the new developments happening in that domain as well.

Like I said, we've got fairly high requirements of these tools. Speed, for example, is an essential requirement: people might want to do real-time processing of image streams, or work on extremely large images, without compromising on speed. We also want to make it easy for people to process images — to easily express the operation they want to perform in a high-level language, but still have it execute extremely rapidly. So when Travis proposed a solution yesterday where you can inject functions into NumPy using something like LLVM, that kind of thing gets us very excited, because we'd be able to program in Python but still get the high-performance benefit underneath. We often need to leverage the different pieces of hardware available — multiple CPUs, GPUs; we've got the capacity to utilize OpenCL, and so on — and making use of all those hardware capabilities poses some extra challenges. And then all of these things need to be exposed to the user in a way that doesn't make them difficult to use.

We've learned a lot along the way in developing scikit-image. We've learned how to build robust pipelines — to make it easy for users to string together a whole bunch of image processing functions and just have that work, without any magic. In those pipelines we need to transform data very efficiently, so there are interesting things happening there, and we have to deal with all the different ways that people represent their data. There are some subtleties in representing images: where is the center of a pixel? You can think of a pixel as a little block of color, or as a data point — so is the origin of the data point in the corner or in the center? What kind of data types should you use to represent your images? We'll talk a little about that later on as well.

Finally, it's been an interesting experiment in building a small community. We're fairly small — maybe four or five active developers — and we're not part of the big SciPy project at the moment; we're doing our own thing, with our own website and so on. So it's been interesting to see how that all came together.

If you look on the screen here, I've got a layout of roughly how all these components fit together. Most of our package is actually built on top of Cython — we leverage Cython to get speed — and NumPy, which is built primarily on C, though it also uses some Cython, for the random number generation for example. On top of NumPy, as most of you know, we've got the higher-level utilities such as SciPy and matplotlib, which we use extensively, and on top of that sits scikit-image. ndimage comes from — I think it might be from numarray; it was developed in the astronomy community, at least, by Peter Verveer, and was integrated into SciPy. But ndimage is written in pure C, and it's a little hard to get people to develop on it, because it's hard to wrap your mind around. So with scikit-image we try to provide a much easier entry point into image processing, and we aim to vastly extend the capabilities of ndimage.

Just to give you an idea of what's available in SciPy's ndimage itself — it's easiest to just go to the SciPy docs. In the ndimage package we've got ndimage.filters, which are common filtering operations; filters relying on the Fourier transform; some interpolation routines, which we need when we do geometric transformations on an image (when you warp or skew an image, you apply spline filters, for example); some methods for doing measurements on images — if you've got a whole bunch of image regions, how do you measure the sizes of those regions, count how many there are, and so on (you'll see an example of that in a moment); and some morphological operations, which operate on the structures inside an image.

That's about it, so what I'd like to do is run through a typical example of what an industry problem might look like and how we'd solve it. I'm going to import ndimage as ndi. Here's the problem definition — and here's the image we'll be working with. It's a microscope image of some heated-up glass. In this specific sample, the glass is the light gray background you see there; there are some black bubbles in it, and unmelted sand grains, which are the slightly darker gray in between. When you've got a slide like that, what you typically want to do is measure how much of each phase there is: what area of the image is covered by the light gray glass background, what fraction by the black bubbles, what fraction by the dark gray sand; how many bubbles of each kind there are, what the average size of a bubble is, that sort of thing.

So how would we approach such a problem? The first thing we need to do is get the image from disk into the computer. (Can you all read what it says here, or should I make it a bit larger? It's okay? Right.) To start, I'm just going to use ndimage and matplotlib, to show you what those tools can do, and later on I'll show how scikit-image builds on top of that. We load matplotlib — we're just going to use it to read the image — we import NumPy as well, and we use matplotlib's imread function to read in the image.

We display the image, setting the colormap explicitly to gray, because by default matplotlib uses — I think the default colormap is jet — so you usually see a whole bunch of colors if you try to display a gray image. Importantly, we also set the interpolation to nearest, because otherwise, if you zoom in on the image, matplotlib will smooth it out as it interpolates the points for you; with nearest, we see the exact pixels we loaded.

The first problem is that the microscope inserts a banner at the bottom, and we just want to get rid of it. That's fairly easy to do with NumPy: the image we loaded is simply a NumPy array, so we can apply all the standard NumPy operations to it. In this case I just take the first 880 rows and throw the rest away — that's the part of the image without the banner. There we go: the bottom is discarded.

Now, you can't really see it too well on this screen, but there are a whole bunch of speckles inside these black blobs that we don't want. How do we get rid of those? In this case I ran a median filter. The median filter is basically a windowed operation: you slide a window across the image, and inside each window you sort all the values and pick the middle one. That tends to get rid of speckle noise — very high and very low values. And there you can see the phases look much cleaner; we don't have as many speckles lying around.

The next step is to separate out those three phases: the light gray glass background, the black blobs, and the dark gray sand grains. How could we separate those out — any suggestions? Oh, I forgot that you can all see the solution.
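Strung together, the steps so far might look roughly like this — a sketch that substitutes a synthetic array for the actual micrograph, since that image isn't bundled anywhere, and guesses at the median window size:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import ndimage as ndi

# Stand-in for plt.imread('...') -- the real micrograph isn't available here.
rng = np.random.default_rng(0)
img = rng.random((1024, 1024))

# Discard the microscope's banner: keep only the first 880 rows.
img = img[:880]

# Median filter: slide a window over the image and keep the middle value
# in each neighborhood -- good at killing isolated speckle noise.
denoised = ndi.median_filter(img, size=5)  # window size is a guess

# Show the exact pixels: gray colormap, no interpolation smoothing.
plt.imshow(denoised, cmap='gray', interpolation='nearest')
```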
What we want to do is set up three different masks: we want to threshold this image in two different places, so we can take all the darker pixels, all the lighter pixels, and the gray pixels in between. But we don't know where to place those thresholds. The way we investigate that is to plot a histogram of all the values in the image, and when we do, you can clearly see there are three different modes: the near-zero intensities are the black bubbles, these darker grays are the sand grains, and the light gray here is the glass background. So where would be a good place to threshold? I think I picked 50, to take those low values out, then all the values between 50 and 120, and then from 120 upward. That's exactly what I do here: I create three different masks. The first operation says "is the median-filtered image less than or equal to 50?", which returns a boolean array — true or false values — as the mask for the bubbles; likewise for the sand and the glass. Then I have a little bit of code that plots all three of those, and there we go: we see the original image, the bubbles isolated, the sand, and the glass.

What would be convenient is if we could take all these layers and combine them into a single visualization. How would we do that? Well, typically we represent an image as an m-by-n array of intensities, but you can also make an m-by-n-by-3 array, where the three layers are red, green, and blue. So we construct a new empty image of the same shape as the original, except with that extra RGB dimension, and we fill those layers with different values according to the masks: wherever there are bubbles we write full red, no green, no blue; for sand we add red and green but no blue; and for glass, no red, no green, and full blue.
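In code, the thresholding and the RGB overlay might be sketched like this — again with a synthetic stand-in for the median-filtered image; the thresholds 50 and 120 are the ones read off the histogram:

```python
import numpy as np

# Synthetic 8-bit stand-in for the median-filtered micrograph.
rng = np.random.default_rng(1)
filtered = rng.integers(0, 256, size=(200, 300))

# Three boolean masks from the two thresholds found on the histogram.
bubbles = filtered <= 50
sand = (filtered > 50) & (filtered <= 120)
glass = filtered > 120

# Combine the masks into one m-by-n-by-3 RGB visualization:
# bubbles -> red, sand -> yellow (red + green), glass -> blue.
overlay = np.zeros(filtered.shape + (3,), dtype=np.uint8)
overlay[bubbles] = [255, 0, 0]
overlay[sand] = [255, 255, 0]
overlay[glass] = [0, 0, 255]
```

Every pixel falls in exactly one mask, so the three classes tile the image.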

There we go — that gives you a very nice, convenient single representation of your different phases, much easier to recognize than in the original image, and it also shows you what you labeled. Now, if you look here at the edges, you still see some yellow coming out over the red — there's still some mixing of the phases at the edges — and we'd like to get rid of that. So we're going to apply a morphological operation. Morphological operations change the structures inside an image, and we'll use them to get rid of these little speckles. Again we move a window over the image and say: if a structure isn't entirely contained in this window, throw it out. That tends to change the sizes of the structures in the image — we'd actually change the sizes of the bubbles — so we then apply the inverse operation, moving the structuring element over the image again to shrink the structures back, hopefully getting rid of the extremities without changing the size of the bubbles too much. That pair of operations is what's called a binary opening and a binary closing, and there are the final phases we're going to work with.

So what's the next step? We know we've got bubbles, sand, and glass — but how do we count how many bubbles there are, how many sand grains, what their average area is, and so on? The first thing we do is label the components: we go through the image, identify all the isolated blobs of connected pixels, and give each a different value. The background is typically labeled zero; we scan along until we find a blob of pixels and label it one; scan, scan, scan, there's another one — that's two; then three, four, five, six, and so on, consecutively labeling the different connected components in the image.
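A minimal sketch of that cleanup step, using scipy.ndimage's binary opening and closing on a toy mask (the 3-by-3 structuring element is an assumption):

```python
import numpy as np
from scipy import ndimage as ndi

# A toy phase mask: one solid "bubble" plus a couple of stray speckles.
mask = np.zeros((40, 40), dtype=bool)
mask[10:30, 10:30] = True   # the bubble
mask[2, 2] = True           # speckle
mask[35, 5] = True          # speckle

# Opening (erosion then dilation) removes structures smaller than the
# structuring element; closing (dilation then erosion) fills small holes.
selem = np.ones((3, 3), dtype=bool)
cleaned = ndi.binary_closing(ndi.binary_opening(mask, structure=selem),
                             structure=selem)
```

The single-pixel speckles disappear, while the big blob survives at its original size.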
That's what this ndimage.label call does: it steps through the image and labels all the components. Then we have a single line of list comprehension to calculate the statistics we want: for each label i in the range from one to the maximum label, find all pixels with that label — where the labels are equal to i — and just count how many there are. That gives us our object areas.

So there we go — these are the labeled images, plotted with the very useful spectral colormap, which changes fairly rapidly over the integers, so you can see all the differently colored bubbles coming out here. The glass is mainly a single blob — it's the background — plus some others, but mainly the big background one; and lots of sand grains. And there are the statistics: about 150 regions found in the sand and 17 bubbles, which makes sense — we've got more sand grains than bubbles, and many fewer regions in the glass phase. As we'd expect, the glass phase has a fairly high average size, because it covers almost the whole image, while the sand grains are typically very small but plentiful. So that's a typical image processing pipeline you might form — I hope that provided some insight.

Now, scikit-image builds on top of the SciPy and ndimage tools. Let me show you the home page for the project.
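That labeling-and-counting step can be sketched on a toy mask like this — ndimage.label and the list comprehension are as described above; the blob layout is made up:

```python
import numpy as np
from scipy import ndimage as ndi

# A toy "bubbles" mask with three separate blobs.
mask = np.zeros((30, 30), dtype=bool)
mask[2:6, 2:6] = True      # 4x4 blob, area 16
mask[10:20, 10:20] = True  # 10x10 blob, area 100
mask[25:28, 3:9] = True    # 3x6 blob, area 18

# Label connected components; background stays 0, blobs get 1..n.
labels, num = ndi.label(mask)

# Area of each labeled region, as in the single-line list comprehension.
areas = [np.sum(labels == i) for i in range(1, num + 1)]
```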

It's a pretty standard template — you'll see the IPython web page looks very similar; we kind of borrowed from them. You can download the project here; we've got a user guide, a gallery of examples, and a way to get the source code, and there's a trivial example on the front page so you can get started quickly. I encourage you to go look at the gallery — it's probably one of the most fun places to browse around and see the kinds of things people have done. It isn't an exhaustive list of what's available in scikit-image, but it gives you a quick preview. For example, we've got a marching squares algorithm: if you've got a height map of some sort and you want to follow contours along that map, you can use the contour finding. We've got numerous edge detectors. Here's an example of histogram equalization: if you've got an image like this one with very low input contrast, you can choose different ways of rescaling it — you can either try to linearize the cumulative histogram, or just stretch the contrast a little, to produce an image like that. This is a cute example where we took a natural image, split it up into little windows, and ran k-means on those sub-images as features; when you do that on the original image you basically get something that looks like Fourier components, but when you do it on the edges in that image — the difference of Gaussians — you get these Gabor-like features coming out. We've got the Hough transform for doing straight-line detection in images. We've got radon transform reconstruction: when you do a CT scan of the brain, you scan it from all the different directions, and summing along all those directions gives you something called a sinogram — that's what your scanner gives you — and if you want to invert it, you do some filtering and apply the inverse radon transform to reconstruct your image.
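The two rescaling strategies just mentioned — stretching the contrast versus linearizing the cumulative histogram — can be sketched with plain NumPy (scikit-image's exposure module provides ready-made versions; the percentile choice here is an assumption):

```python
import numpy as np

# A low-contrast image: values bunched tightly around 0.5.
rng = np.random.default_rng(2)
img = np.clip(rng.normal(0.5, 0.05, size=(64, 64)), 0, 1)

# Contrast stretching: map the 2nd..98th percentiles onto [0, 1].
p2, p98 = np.percentile(img, (2, 98))
stretched = np.clip((img - p2) / (p98 - p2), 0, 1)

# Histogram equalization: send each pixel through the normalized
# cumulative distribution of intensities, flattening the histogram.
values, counts = np.unique(img, return_counts=True)
cdf = np.cumsum(counts) / img.size
equalized = np.interp(img, values, cdf)
```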
Then there's segmentation, feature detection, denoising — so that gives you a rough idea of the kinds of things we do. Here's a typical example using the scikit-image modules: I imported data, filter, and io. data is where we typically get our sample images from; in this case I use the coins sample image, then I apply the Sobel filter to that image and display the result. There we go. io.imshow uses matplotlib by default, but we've got a whole variety of different output plugins, so you can display using Qt or matplotlib or whatever is available on your system — we basically want to make it easy for users to display images irrespective of what they've got installed. But it's just a NumPy array, so displaying it with plain matplotlib would work fine too. In the same way that we have output plugins, we've also got several input plugins: you could use the Python Imaging Library to perform image reads by telling io.use_plugin to use PIL for reading, and then when you read the image, it's read with PIL rather than matplotlib. Some libraries allow us to do interesting things — in our Qt backend we've got a fancy imshow, so if you use the Qt plugin for image display and you pass fancy=True, you'll see something interesting happen: a little viewer pops up.
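That Sobel example, written out — note I'm using scikit-image's later module name skimage.filters (at the time of the talk it was the singular skimage.filter), and plain matplotlib for display:

```python
import matplotlib.pyplot as plt
from skimage import data, filters

coins = data.coins()           # bundled sample image of Greek coins
edges = filters.sobel(coins)   # Sobel edge-magnitude filter

plt.imshow(edges, cmap='gray')
```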

Your image is displayed with histograms on the side, and those histograms are updated as you adjust the colors in the image — watch the red histogram as I adjust the red component; you'll see it climb. This all happens across several processes: the work is split up to process the image more rapidly. And you'll notice at the bottom you can follow the cursor — the pixel position, the RGB values, and the hue, saturation, and value at that point.

Let's adjust the image somewhat. First I'll add some green to the image — just slide that up a bit. This part is a little like revision control: you can choose to commit your changes or revert them, so you make cumulative changes to your image and move up and down that stack. There, that's committed. Now we can go to the gamma adjustment and push the gamma value way up or way down — that looks kind of cool, let's do that. And here's a button called "Save to stack": we keep an image stack around, and if you push images onto that stack, you can go back to your notebook and grab the image off the stack to manipulate it further. Let me commit that, make the image a little brighter, and click Save to stack. Now I'll close the viewer, go back here, and say io.imshow(io.pop()) — io.pop goes to the stack I just pushed that image onto, grabs it out of there, and then we can display it.

Often you want to read files coming from different domains: in astronomy they use the FITS image file format; you might use DICOM images if you're working in the medical domain. We've got different plugins for these, so in this case I'll say use the FITS plugin, then try to read this star image — and there we go, there's the image.

Now, one of the biggest problems we've had — oops, did I just kill my notebook? — is that people use images with different value ranges, and this becomes quite problematic. Typically you'd store your image as a floating-point array with all values between 0 and 1, but other people store images as unsigned integers ranging from 0 to 255, or bigger integers running from 0 to 65535, and some people store images running from -128 to 127. How do you deal with all of these in a way that's not too confusing to the user? We basically support all of them, but we assign very specific expectations: if your image is in floating-point format, the values must be between 0 and 1, otherwise we don't know what you mean. And we've got several conversion functions — img_as_float, img_as_ubyte, img_as_uint, img_as_int — which convert your image to the new representation, taking the ranges into account. These functions are typically called at the beginning of any filtering operation: the filter takes in any image of any data type, converts it to what it needs, and spits out whatever is most convenient. Because all the functions accept any kind of image, you can string them together and it should always just work out of the box.

My kernel is still running, so it's all fine. Here's that NGC star image I just loaded, and if I print the minimum and maximum, you'll see it ranges from 100 to 65535. It was actually an integer image, but the FITS plugin — well, the pyfits library in Python — returns floating point, so we wouldn't be able to handle an image like that: the values are outside the range 0 to 1. If you try to filter an image like that, you should see an error, and that's what happens: it tells you your image's values should have been between 0 and 1.

To handle that, we've got a function called rescale_intensity. If you know what the range of your intensities is, you can call this function; in this case I didn't provide any parameters telling it the ranges, so it just picks 100 — the minimum value in the image — and sets that to zero, picks 65535 — the maximum — and sets that to one, and returns the result. Now if you filter the returned image, there should be no problem, and that's what we see over there: no problem. You can also tell rescale_intensity what you expect the range of your values to be. We've now changed this image to have values between 0 and 1, but we could tell it that the maximum is 0.25 instead of 1 — what would that do? It would discard all values above 0.25 and stretch 0.25 to be the maximum value, and when you do that, you get a much brighter version of the original image.

Here's an illustration of the conversion functions: I took that image, first converted it to float using img_as_float, and printed its maximum value — you can see it runs from 0 to 1. Then I converted it to integer — it runs up to the maximum integer value — and converted to unsigned byte it runs from 0 to 255. Now, often when you want to test your algorithms, you need access to some test data.
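A sketch of that range handling, with a tiny array standing in for the star image (img_as_float, img_as_ubyte, and exposure.rescale_intensity are the real scikit-image names):

```python
import numpy as np
from skimage import exposure, img_as_float, img_as_ubyte

# A tiny uint16 stand-in for the star image, ranging 100..65535.
img = np.array([[100, 30000], [50000, 65535]], dtype=np.uint16)

# rescale_intensity stretches the observed range onto the dtype's full
# range (or onto whatever in_range/out_range you pass it).
rescaled = exposure.rescale_intensity(img)

as_float = img_as_float(img)   # floating point, values in [0, 1]
as_ubyte = img_as_ubyte(img)   # unsigned byte, values in [0, 255]
```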
We provide several test images in the data subpackage: data.camera(), for example, gives you this guy, and data.coins is this image. If you type data.coins? in IPython, it loads the docstring for you, and you can see it says these are Greek coins from Pompeii — it tells you a little about the history of the image, what it's typically used for, and what the copyright situation is, i.e. where it was obtained. We try to use images that are freely available, so if you want to use one in your paper, you don't have to worry about that.

Like I said, it's important for image processing users to be able to construct a pipeline without any hassle. If you work with ndimage, for example, you constantly have these problems where you convert an image but the range ends up outside what was expected. If you take an image, multiply it by 2, and display it in matplotlib, you typically won't see any difference, because some automatic scaling takes place — that's why we have these functions that make sure that if you multiply your image by 2, you actually see twice the intensity.

So let's look at this example of constructing a pipeline. I take the camera image I just showed you — I don't know what format it's in; it doesn't matter, the Canny filter knows how to handle it — so that block over here is plain edge detection. But if I first want to apply total variation denoising to my image and then run the Canny filter, I can just string them together like this. I don't have to worry about the ranges, the data types, nothing — that's all taken care of; I just string them together, that's my pipeline, and when I display the result, there it is. You can see that if I just applied the Canny edge detector, it detected a whole bunch of spurious edges; if I first denoise the image and then apply it, the edges are much, much cleaner.

One of the algorithms we implement is called shortest path, and I want to show you
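The denoise-then-detect chaining just described might be written like this — a sketch using the current module layout (restoration.denoise_tv_chambolle and feature.canny; the weight value is an assumption):

```python
from skimage import data, feature, img_as_float, restoration

camera = img_as_float(data.camera())

# Plain Canny edge detection on the raw image...
edges_noisy = feature.canny(camera)

# ...versus total-variation denoising chained in front of it.  The
# functions handle ranges and dtypes themselves, so they compose freely.
denoised = restoration.denoise_tv_chambolle(camera, weight=0.1)
edges_clean = feature.canny(denoised)
```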

I want to show you this video of how shortest paths are typically used — let me get it playing, and I'll narrate as it goes along. He's demonstrating content-aware resizing here. He explains that on a typical Wikipedia page, if you shrink the page's width, the image remains constant in size even though the text reflows to fit nicely — which is less than ideal. Here's an example with a bear and a cub: if you try to shrink that photo, you typically end up in a difficult situation. Either you crop the image — so the bear is here, but the cubs are over there and you can't see them — or you scale the image, but then everything is squished up. Or you cheat a little and retarget: here they took the image and tried to change the aspect ratio, so this is the retargeted image, this is what happens when you just squish it, and this is the image cropped. You can see that over here you've got all the content you want, without changing the aspect ratio or losing too much of the image content — all the features remain inside the image even though it's being rescaled, even if you stretch it out.

So he wants to shrink the width of this image of the blocks — how is he going to do that? What they show is that they do edge detection on the image first, and that helps them decide which parts of the image can be removed. They call this technique seam carving. The idea is that your eye is much more sensitive to edges than to smooth areas, so if I took out a path through this image that ran along here, avoiding all the edges, your eye probably wouldn't notice what happened. First he illustrates what happens when you just remove columns from the image — it removes the columns that cross the fewest gradients. There are the edges again, and then they detect the path that runs along the fewest edges.
They keep removing those paths one by one — blue to red is the order of the seam removals, so you can see which were removed first and last — and here it inserts pixels to grow the image. It's a good example: they tried different things, using edges or saliency or whatever, but in the end the gradient magnitude seems to be fine for what they're trying to achieve. Sometimes features you don't want removed get removed from the image — like in this case, look at the baby's head; that wasn't too good — so you mark the areas you don't want removed, and they basically get increased penalty values so they don't get selected for removal.

This also makes a great example the other way around: if you don't want that guy in your picture, you can actually select him and say we'd prefer the least-cost path to go through there — and there we go. Thank you. So that's the technique called seam carving, and we actually have an implementation of it, using the shortest-path algorithm, in scikit-image.

Often we want to geometrically transform images as well. You might want to take an image and downsample it: in this case I started with the cameraman image and called ndimage.zoom with a zoom factor of 0.1, scaling the image down by 10, and when you do that you get this — you can see it's much blockier than the original. Sometimes we want to be a bit fancier than that. Inside scikit-image we have the transform module, which does all sorts of different transforms. We can specify this transformation matrix here — a three-by-three transformation matrix that is applied to each coordinate in the image, and wherever the result lands, that's where the coordinate is moved to. Those of you with a background in linear algebra will recognize the first 2-by-2 sub-block of this matrix as a rotation matrix — cosine, minus sine, sine, cosine — with the s in there for scaling x and y. We've got translation parameters tx and ty for shifting the image around, and then some skew factors s0 and s1. If we just want to zoom, as in the previous example, we simply set s equal to 0.1 and shrink the image down by a factor of 10. So let's do that: we call transform.homography with that matrix, and there's the output. You'll notice I cropped the output by selecting the first 50 by 50 pixels; if I take that out, you can see what's really going on — we take the image and shrink it down by 10, and it's still sitting down there in the corner. But you can do slightly more interesting things: you can rotate the image. Let's say we want to rotate it by 15 degrees, translate it by 100 pixels in each direction, and add a small skew factor.
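Here's a sketch of building and applying such a matrix with the current API (transform.ProjectiveTransform plus transform.warp, rather than the transform.homography call from the talk; warp takes the inverse mapping):

```python
import numpy as np
from skimage import data, transform

camera = data.camera()

# Rotation by theta, uniform scale s, translation (tx, ty);
# the two skew terms from the talk are left at zero here.
theta = np.deg2rad(15)
s, tx, ty = 1.0, 100, 100
H = np.array([
    [s * np.cos(theta), -s * np.sin(theta), tx],
    [s * np.sin(theta),  s * np.cos(theta), ty],
    [0.0,                0.0,               1.0],
])

tform = transform.ProjectiveTransform(H)
warped = transform.warp(camera, tform.inverse)  # warp wants output->input
```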
Oops — my scale is still 0.1; let's not do that. I'll set the scale to 1, and there we go: a linear warp of the image. Now, ndimage also provides a mapping function. Because this is a linear transformation, all straight lines in this image remain straight lines over here — but sometimes you want to do nonlinear transformations, and for that there's the coordinate map function. Let me see if I can run that demo. Here's a photograph of a window frame, but you can see severe distortion toward the edges of the image: these lines are supposed to be straight, but they're very much curved, and we'd like to pull them straight. What I do is provide the script with the straight lines: I mark three points — one, two, three — and on another line, one, two, three. Based on those coordinates I gave it, I say: please optimize the warp so that those three points lie on a straight line, and this is what you get out. I didn't do a very good job of marking, but these lines are much straighter than they were in the original input image. Recently we've also added support for block views and filtering.

Often you just want to slide a window across your image, do some operation on that window, and return the result. If you look in the shape utility, you can import view_as_windows and view your input image as a bunch of sliding windows. It's just a view into the original image — no copying of data takes place — so you can do this on a fairly large image. You specify the shape you want: your input image is 512 by 512, and you break it up into blocks of 20 by 20, so your output is 493 by 493 — almost the size of the original input, since the windows have to fit inside the image — and each block is 20 by 20. Then you can do some operation on each block, like taking the maximum of the entire block, and if you display that, you get something like this — you can clearly see the blocky structure of that filter coming out.

If you're looking to detect features in your image, you can call HOG — the histogram of oriented gradients. That basically examines the image at each pixel position, looks at the gradient, and counts the gradients in each direction. Here's a visualization of it: brighter pixels indicate more gradient found in certain directions, and you can basically see the original structure of the image coming out — lots of vertical here along the pipe (sorry, horizontal), and more vertical along there, on his shoulder for example, because the gradient runs horizontally.

Right — I think that's about it. If you're interested in image processing, or you can use some of these utilities, feel free to get in touch with us; we'd like to welcome more people into our community, and the mailing list is populated by people who know a lot about image processing, so feel free to come and ask questions or just discuss the tools you're using. Seems like we're done — thanks very much.
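The windowed maximum just described might look like this (view_as_windows is the real utility; the reduction shown is a sketch):

```python
import numpy as np
from skimage.util import view_as_windows

# A 512x512 test image.
img = np.arange(512 * 512, dtype=float).reshape(512, 512)

# All 20x20 sliding windows, as a zero-copy view: shape (493, 493, 20, 20),
# since 512 - 20 + 1 = 493 window positions fit along each axis.
windows = view_as_windows(img, (20, 20))

# Reduce each window to its maximum -- a windowed maximum filter.
local_max = windows.max(axis=(-2, -1))
```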