TwigBeard edited this page Jun 29, 2016 · 30 revisions

In progress --> Page Needs Additional Info

Point Classification

Three areas with mining activity were identified from Tita’s Mountain-Top Removal study. Tita’s Scene Extraction script imports greenest pixel composites, NDVIs, EVIs, and SAVIs for each mining area scene. To build a greenest pixel composite, the script collects imagery of that area for the specified year, computes the NDVI of each image, and for each pixel keeps the observation with the highest NDVI value, compositing all those pixels into one image. This ensures that the resulting image has the best contrast between the greenest possible pixels (representing healthy vegetation) and any mining area.
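
As a rough illustration of the greenest pixel idea described above, here is a minimal pure-Python sketch. It is not Tita's script: the function names, the nested-list scene layout, and the `red`/`nir` keys are all assumptions made for the example.

```python
def ndvi(nir, red):
    """NDVI for one pixel: (NIR - Red) / (NIR + Red); 0.0 if both bands are 0."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def greenest_pixel_composite(scenes):
    """Build a composite from co-registered scenes of the same area.

    scenes: list of scenes; each scene is a list of rows, and each pixel is a
    dict with at least 'red' and 'nir' reflectance values (hypothetical layout).
    For every pixel location, the observation with the highest NDVI is kept.
    """
    rows = len(scenes[0])
    cols = len(scenes[0][0])
    composite = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # All observations of this location across the year's scenes.
            candidates = [scene[r][c] for scene in scenes]
            greenest = max(candidates, key=lambda px: ndvi(px["nir"], px["red"]))
            out_row.append(greenest)
        composite.append(out_row)
    return composite
```

Because the whole observation is kept (not just its NDVI), the composite can still be displayed as a Red, Green, Blue image if those bands are stored in each pixel.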

The three greenest pixel composites were then loaded into QGIS, along with the 3 sets of 300 points each that were randomly generated (using the “Random Points…” function in QGIS) over each of these mining areas. The result appears as:

Figure 1. The 300 randomly generated points labeled “Alpha” overlaying the greenest pixel composite (SceneII, which corresponds to Alpha points) for the year 1984. It is a Red, Green, Blue composite.

Once the imagery and blank points were loaded into QGIS, “Toggle Editing” was selected for whichever set of points was displayed (in this case, Alpha). Then the attribute table was opened, which has the fields X, Y (coordinates), ID, and Class. Class starts out blank and is the only field that was edited. Next, a person zoomed in on every point and determined whether it overlay a pixel with active mining (1) or no active mining (0). To ensure accurate classification, the NDVI of that scene was also imported, which appears as:

Figure 2. The points (Alpha) being classified overlaying an NDVI of Scene II. Each NDVI pixel holds a single value between -1 and 1, so an NDVI displays as a one-band image.

The classification of each point depended on the pixel it overlay. If a point was over a bright white pixel in the greenest pixel composite, it was classified as active mining; otherwise it was classified as non-mining. To ensure accuracy, the points were also checked against the NDVI. If a point overlay a very dark or black pixel in the NDVI, it was classified as active mining, and as non-mining otherwise.

This was done for every scene, for a total of 900 points. The attribute tables for all three sets of points were then copied into Google Sheets, along with the NDVI, EVI, and SAVI values of the pixel that each point overlay. The resulting sheet was used for statistical evaluation to determine the best NDVI threshold to use in our MTR classification script.
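
One way such a threshold could be picked from the labeled points is sketched below. The page does not say which statistic was used in the spreadsheet, so the criterion here (maximizing Youden's J, i.e. true positive rate minus false positive rate) is an assumption, as are the function name and the rule that a point counts as mining when its NDVI falls below the threshold.

```python
def best_ndvi_threshold(samples):
    """Pick the NDVI cutoff that best separates mining from non-mining points.

    samples: list of (ndvi_value, label) pairs, label 1 = active mining,
    0 = no active mining (the Class field from the attribute table).
    Assumption: a point is predicted as mining when ndvi_value < threshold.
    Returns (threshold, youden_j).
    """
    positives = sum(1 for _, label in samples if label == 1)
    negatives = len(samples) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("samples must contain both classes")

    # Candidate cutoffs: midpoints between consecutive distinct NDVI values.
    values = sorted({value for value, _ in samples})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    best_threshold, best_j = None, -1.0
    for threshold in candidates:
        true_pos = sum(1 for v, lab in samples if lab == 1 and v < threshold)
        false_pos = sum(1 for v, lab in samples if lab == 0 and v < threshold)
        j = true_pos / positives - false_pos / negatives
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j
```

The same function could be run on the EVI or SAVI column instead of NDVI to compare how well each index separates the two classes.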

Early (1970s) Landsat 1-3 Classification

Andrew provided us with a script that manually assembles greenest pixel composites (viewable only in false color, band combination 7,3,2) and NDVI ratios. However, there are several problems with viewing old Landsat composites: the resolution is lower (60 m pixel size compared to 30 m currently), and Google Earth Engine may not have enough scenes per year to compile quality annual composites. I tried downloading 2-year composites, which did appear to be of better quality than 1-year composites. After downloading 3-year and then 4-year composites, I decided the 4-year composite (covering 1972-1975) looked the best and was the easiest to classify. This needs to be replicated in our script. We may need to display classified mining sites for the years 1972-1975 together when using Landsat 1. In the Landsat 1 statistics spreadsheet (named MTR_VI) there are a lot of pixels with values between 0.51 and 0.58, which account for a large number of false positives under Andrew's ROC-derived thresholds (0.57635 and 0.58705). This could be because mountain-top removal and surface mining began sometime between 1972 and 1975 (most likely in 1975), so construction was just starting in certain areas. However, I do not know for certain, and this large number of false positives occurs only in the Landsat 1 classifications, not in the Landsat 2 classifications.

The same problem occurred for Landsat 2. After exporting a composite for only one year (1975, the launch of Landsat 2), a large portion of the first scene had missing data: holes with no values. This may have been due to cloud cover, or simply to not having enough scenes. I decided to go with a 3-year composite (1975-1977) because it had sufficient data for classification, and to leave the year 1978 for Landsat 3.
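
To show why widening the time window fills those holes, here is a small sketch that composites per-year NDVI grids and measures how much of the result is still empty. It is an illustration only, not Andrew's script: the grids are plain nested lists, `None` stands in for a no-data pixel, and both function names are made up for the example.

```python
def composite_max_ndvi(yearly_grids):
    """Per-pixel maximum NDVI across several years, skipping no-data values.

    yearly_grids: list of equally sized 2-D lists; None marks missing data.
    A pixel stays None only if it is missing in every year of the window.
    """
    rows, cols = len(yearly_grids[0]), len(yearly_grids[0][0])
    composite = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            valid = [grid[r][c] for grid in yearly_grids if grid[r][c] is not None]
            out_row.append(max(valid) if valid else None)
        composite.append(out_row)
    return composite


def missing_fraction(grid):
    """Share of pixels in the grid that still have no data."""
    cells = [value for row in grid for value in row]
    return sum(1 for value in cells if value is None) / len(cells)
```

Comparing `missing_fraction` for a 1-year window against a 3-year or 4-year window gives a quick number to back up the visual judgment about which composite is usable.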

After correcting for the lack of Landsat 1-2 scenes in this way, I used the false color greenest pixel composites and NDVIs as usual, overlaying our designated points and classifying each as either non-mining (0) or mining (1).

We may need to derive different sets of points for these early years. The current point sets (Alpha, Bravo, Charlie, Control) were derived around already known mining sites (from sometime in the mid-80s? or 90s?). These classifications will need to be examined to determine whether there were even mining sites there at the time.*

Exporting NDVI classifications from Earth Engine

After running various statistics, we settled on 0.51 as the best NDVI threshold for classifying active mining sites across Appalachia. Our classification script applied this threshold for several years, including 2005, 1995, and 1985. We wanted to export imagery with our classification to compare with a similar study, so we used code similar to Tita's script.
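
The thresholding step itself is simple enough to sketch in a few lines. This is not the Earth Engine script; it assumes the NDVI has already been read into a nested list, that pixels below the threshold count as active mining (consistent with dark NDVI pixels being mining in the point classification), and that `None` marks no-data pixels. The names are hypothetical.

```python
NDVI_THRESHOLD = 0.51  # threshold reported on this page


def classify_mining(ndvi_grid, threshold=NDVI_THRESHOLD):
    """Return a grid of 1 (active mining), 0 (not mining), or None (no data).

    Assumption: a pixel is flagged as mining when its NDVI is below threshold.
    """
    return [
        [None if value is None else (1 if value < threshold else 0) for value in row]
        for row in ndvi_grid
    ]
```

For example, `classify_mining([[0.2, 0.8]])` flags the first pixel as mining and the second as non-mining.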

Where Tita's script separated the imagery into 4 quadrants and assigned them to jsonCord variables, we kept the study area in one big image and assigned it to one jsonCord variable. We could do this because the resulting imagery was very small (about 1-5 MB).

Campagna Study Comparison

The classification imagery then had to be compared to David Campagna's vector files tracing mining sites in 2005, 1995, and 1985. First, the vector files had to be rasterized. Next, the Campagna rasters were subtracted from our classification imagery. In the result, a pixel value of 1 marks where we classified active mining and Campagna did not, and a value of -1 marks where Campagna classified active mining and we did not. A value of 0 means the two classifications agreed, whether both marked active mining or both marked none.
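
The subtraction and the meaning of its three output values can be sketched as follows. This assumes both rasters are already aligned binary grids (1 = active mining, 0 = not) held as nested lists; the function names are made up for the example and the real work was done on raster files.

```python
from collections import Counter


def difference_raster(ours, campagna):
    """Pixel-wise ours - campagna for two aligned binary rasters.

     1 -> we classified mining, Campagna did not
    -1 -> Campagna classified mining, we did not
     0 -> both agree (both mining, or both non-mining)
    """
    same_shape = len(ours) == len(campagna) and all(
        len(a) == len(b) for a, b in zip(ours, campagna)
    )
    if not same_shape:
        raise ValueError("rasters must have identical dimensions")
    return [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(ours, campagna)]


def tally(diff):
    """Count how many pixels fall into each difference value."""
    return Counter(value for row in diff for value in row)
```

Note that a 0 in the difference does not by itself say whether the pixel is mining; to separate the two kinds of agreement, the original rasters have to be consulted as well.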

However, 1985 presented a problem. The imagery exported from our classification script contained a large strip that appeared not to have been collected from that area. The strip had many pixels with low NDVI values, so it skewed our classification. The exported 1985 imagery showed a rectangular strip of supposed active mining sites that did not fit the surrounding landscape at all.

I wanted to better determine what the strip was, so I downloaded RGB imagery of that area from 1985. The imagery had the same strip, which again did not match the surrounding landscape. It also contained several bright white areas that could be bare earth, which is probably why it skewed our classification. But we still needed 1985 imagery, because that is the only year around then that Campagna mapped (not 1984 or 1986). Next, I tried clipping the strip out of the 1985 imagery and cutting the same area from the 1984 imagery, then pasting the two clipped rasters together to obtain mostly 1985 imagery with a little 1984. However, this proved difficult because the clips changed the surrounding imagery to pixels with a value of 0. So the final solution was to clip the half of the 1985 imagery without the strip, clip the same area from the Campagna 1985 raster, and simply do the subtraction between the two clipped images.
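
The final workaround amounts to clipping the same window out of both rasters before subtracting. A minimal sketch, assuming aligned binary grids held as nested lists and a strip that occupies a known range of columns; the tiny rasters and the function names are hypothetical, chosen only to make the example self-contained.

```python
def clip_columns(grid, col_start, col_stop):
    """Keep columns col_start up to (not including) col_stop of every row."""
    return [row[col_start:col_stop] for row in grid]


def subtract(ours, reference):
    """Pixel-wise ours - reference for two equally sized grids."""
    return [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(ours, reference)]


# Hypothetical 2x4 rasters; assume columns 2-3 of ours_1985 hold the bad strip.
ours_1985 = [[0, 1, 1, 1],
             [0, 0, 1, 1]]
campagna_1985 = [[0, 1, 0, 0],
                 [1, 0, 0, 0]]

# Clip the identical strip-free window from both rasters, then subtract.
clean_ours = clip_columns(ours_1985, 0, 2)
clean_campagna = clip_columns(campagna_1985, 0, 2)
difference = subtract(clean_ours, clean_campagna)
```

Clipping both inputs with the same window keeps them aligned, so the strip never enters the comparison and no pixels need to be pasted in from 1984.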