The Australian Antarctic automatic weather station dataset

Instructions for publishing AWS data on this website


1 Extracting ascii data from argos_main
2 Importing ascii data into Staroffice
3 Processing data in Staroffice
  3.1 The Starcalc file and directory structure
  3.2 Converting time to "days since 1980"
  3.3 Filtering temperature, relative humidity and barometric pressure data
  3.4 Processing pyranometer data
  3.5 Filtering wind speed and wind direction data
  3.6 Processing acoustic accumulation data
4 Extracting processed ascii data from Staroffice
5 Converting processed ascii data into netcdf format
6 Publishing on the AWS website and graphing with NCAR graphics
Appendix A: Corrections to raw LGB20 barometric pressure data
Appendix B: Problems that may occur with sub-surface temperature data
  B.1 Wrapped data
  B.2 Reflected data
Appendix C: List of all fields published in the dataset
Appendix D: Ongoing processing notes

1. Extract ascii data from argos_main

Open the updated database for a given year of the station to be processed and published. For example, let's say you want to publish all data from the site LGB00. The website data table lists three separate stations that have operated at that site, so three different datasets need to be published.

The first station is labelled "LGB00" in all the files relevant to that AWS. The next station to operate at that site is called "LGB00-A", and the last "LGB00-B". Should a new station operate there, it would be called "LGB00-C", the next "LGB00-D" and so on. We want to extract an ascii version of all data for the first station, "LGB00", from argos_main. To do this, first choose the data year to process by choosing an appropriate configuration file. Since LGB00 started operation in 1982, that is the first year to process. Assuming you've already picked the 1982 config file, go to the "File" menu in argos_main and choose "Show databases". The screen that appears provides a summary of all databases for the year 1982, but you'll have to change to the updated set of databases by clicking on "List sensor database". To extract the ascii data of this database, click on "Save as text file", and choose to save all 12 months of data. Save the file using the following nomenclature: LGB001982.txt.

You should save data in a working directory named after the station; its location is listed under the "The Starcalc file and directory structure" section below.

2. Import ascii data into Staroffice

Continuing with the LGB00 example of the previous section, the file LGB001982.txt can be imported into Staroffice on a unix or linux workstation. To do this, start up Staroffice, and in Staroffice go to the "File" menu and choose "Open" (file>open). Go to the LGB00 working directory (see The Starcalc file and directory structure) and choose the file LGB001982.txt. Before opening this file, choose the format to import text into a spreadsheet. That format is listed under "File type", and is "< Text - txt - csv (StarCalc) >". Now you can click "Open".

A GUI will appear on screen asking how to allocate spreadsheet columns to the data. Choose "Fixed width", and manually add the columns to each field of the AWS data. Once this is done, open the dataset and a spreadsheet should pop up containing one year of the station's data. This LGB001982 spreadsheet contains a lot of extra characters and data that are not required. Below is an example of a raw spreadsheet similar to LGB001982. In the example, only the cells coloured grey are to be kept; the others should be deleted. Note that "Battery voltage" is deleted, because it is not published. Solar power output (abbreviated "Spa") is also deleted should the station you are processing contain the Spa field, as are any other fields not to be published. A list of all fields published in the AWS dataset can be seen in Appendix C. Note that text output from the argos program no longer contains separate headings for each month, and there is no indication of the start of each month in the raw 12 month argos_main text output. This is not a problem, since the algorithm for translating observation times (see section 3.2) automatically takes this into account.

Day:   Time:  AiT4  :   AiT2  :   AiT1  :  SST10 :   Bar   :   WDi :   WM4 :  WM2 :  WM1:  Bat :
22 - 00:14    -34.46    -34.51    -34.61   -40.73     704.0     208     7.3    6.9     6.4    14.1
22 - 01:02    -34.08    -34.19    -34.29   -40.73     704.3     208     7.3    6.9     6.2    14.4
22 - 01:50    -32.57    -32.61    -32.74   -40.73     704.3     214    6.8     6.4     5.7    14.8
22 - 02:38    -32.30    -32.28    -32.41   -40.73     704.5     208    7.7     7.3     6.6    15.2
22 - 03:26    -31.23    -31.25    -31.29   -40.73     705.0     220    8.2     8.0     7.3    15.3
22 - 05:02    -29.84    -29.78    -29.83   -40.74     705.4     208    7.3     7.1     6.6    15.9
22 - 06:38   -28.04    -27.96    -27.93   -40.74     705.7     197    8.4     8.0     7.5    16.0
22 - 08:14    -26.33    -26.25    -26.28   -40.74     706.1     191    7.3     7.1     6.4    16.0
22 - 09:50    -25.47    -25.45    -25.42   -40.74     706.8     186    7.5     7.1     6.6    15.9
22 - 11:26    -25.39    -25.33    -25.34   -40.74     707.2     186    7.8     7.5     6.8    15.6
22 - 13:02    -25.75    -25.77    -25.82   -40.74     707.7     186    6.6     6.4     5.7    15.2
22 - 13:50    -26.52    -26.59    -26.70   -40.74     708.1     186    6.2     6.1     5.3    15.2

You should also delete any blank lines in the dataset, make sure that only one heading of Day, Time etc. is given at the top of the spreadsheet, and delete all other headings further down the file if they exist. In short, clean up the spreadsheet so it contains no gaps, has only one heading, and has only the fields of use. Continuing our example, the spreadsheet above would end up looking like this:

Day: Time:  AiT4  :   AiT2  :   AiT1  :  SST10 :   Bar   :   WDi :   WM4 :  WM2 :  WM1: 
22 00:14    -34.46    -34.51    -34.61   -40.73     704.0     208     7.3    6.9     6.4
22 01:02    -34.08    -34.19    -34.29   -40.73     704.3     208     7.3    6.9     6.2
22 01:50    -32.57    -32.61    -32.74   -40.73     704.3     214    6.8     6.4     5.7
22 02:38    -32.30    -32.28    -32.41   -40.73     704.5     208    7.7     7.3     6.6
22 03:26    -31.23    -31.25    -31.29   -40.73     705.0     220    8.2     8.0     7.3
22 05:02    -29.84    -29.78    -29.83   -40.74     705.4     208    7.3     7.1     6.6
22 06:38   -28.04    -27.96    -27.93   -40.74     705.7     197    8.4     8.0     7.5
22 08:14    -26.33    -26.25    -26.28   -40.74     706.1     191    7.3     7.1     6.4
22 09:50    -25.47    -25.45    -25.42   -40.74     706.8     186    7.5     7.1     6.6
22 11:26    -25.39    -25.33    -25.34   -40.74     707.2     186    7.8     7.5     6.8
22 13:02    -25.75    -25.77    -25.82   -40.74     707.7     186    6.6     6.4     5.7
22 13:50    -26.52    -26.59    -26.70   -40.74     708.1     186    6.2     6.1     5.3

Once all cleaning up has been done for the given year's dataset, save it as a "Starcalc" file. As an example, one would save LGB001982 in the LGB00 working directory under the name LGB001982.sdc (the .sdc extension is given to Starcalc files). This importing and cleaning up step must be done for each individual year of the station's data before one can proceed to the next step, and it is important that it be done thoroughly to avoid errors later. So, if you were to continue with the example of LGB00, you would go back to the previous step and extract ascii files for all the years of LGB00 operation (1982-1989), then turn each of them into clean Starcalc files.

3. Process data in Staroffice

This is the most important step in publishing AWS data on the web. This step involves editing all dropouts in satellite communication with the AWS, errors in wind data caused by freeze ups of anemometers and wind vanes, malfunction of pressure and temperature sensors, converting pyranometer data into daily cumulative incoming solar irradiance and correcting accumulation data for air temperature changes. This step also requires the format of the AWS observation times to be changed. The aim is to construct a dataset where every erroneous datum is identified with the value "99999.9". 

Algorithms to filter and convert data have already been set up in Starcalc, and creating a new dataset is mostly a matter of cutting and pasting the appropriate columns from another AWS Starcalc file to the one you are using. As always when cutting and pasting algorithms in spreadsheets, exercise extreme caution to avoid inadvertent mistakes you may not detect. It is strongly recommended you copy all algorithms from /v/argos/awswebsite/staff/A028-B_1998_2000.sdc.Z (note this file needs to be uncompressed before opening), which has data from all sensors used on AWSs, contains the most refined algorithms, and has been carefully checked for errors.

This section is written to guide you through setting up a new station's spreadsheet from scratch, and along the way explains what columns do what and why. The best way to understand it is to work through an example as you read the instructions. 

3.1 The Starcalc file and directory structure

The first step in processing AWS data is to set up a Starcalc file. Three spreadsheets are required within each file, called "Preprocessing", "Processing" and "Postprocessing". Open up A028-B_1998_2000.sdc, but beware it may take a while. Please note the following points when processing in Starcalc:
  • The datasets are too large to manipulate comfortably in spreadsheets. Limit Starcalc file size to 2-3 years of data. 
  • Run Staroffice on a computer containing at least 4GB of RAM.
  • Never hand edit data in the Preprocessing spreadsheet, as it can cause significant problems in the algorithms used to filter data. 
  • Never graph data in Staroffice as it may crash the computer.
  • Cells coloured blue in the examples that follow must be changed for each new Starcalc file.
Within the A028-B_1998_2000.sdc file, you will notice access to the three spreadsheets is via the tabs at the bottom of Starcalc window, in the same fashion as an Excel file. Click on the tabs to move to the desired spreadsheet. 

The Preprocessing spreadsheet contains only "cleaned up" raw data previously discussed in section 2. The raw data prepared for each individual year should be concatenated to form a multiple year raw dataset. 

The Processing spreadsheet always has six initial columns that process the observation times. Of these columns, column "C" is automatically copied from the Preprocessing spreadsheet. This spreadsheet also contains copies of all other data from the Preprocessing spreadsheet, followed by several columns for each data type (e.g. temperature or pressure) which filter and process the data using algorithms discussed in sections 3.3 to 3.6 below.

The Postprocessing spreadsheet copies all columns with a heading ending in "(ed)" from the algorithms of the Processing spreadsheet. This spreadsheet contains only data to be published on this website. 

File nomenclature and directory location should be consistent so that it is immediately apparent what each file and directory contains. File nomenclature is {station name}_{starting year of data in file}_{ending year of data in file}{extension} where extension is .sdc for starcalc files and .txt for text files. The files should all be kept in a subdirectory named after the station being processed, together with all other files involved in processing data from that station. Remember to regularly save your work. The approximate memory required per year per station is 15 MB. Current (27/2/2002) file directory locations on Berg are:

  • /v/aws (data)
  • /v/argos/AWS (software)
  • /v/argos/awswebsite (website)
You should save all data processing you do for a station in the same directory, and treat it as your working directory.
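As a sanity check, the nomenclature above is simple enough to generate programmatically. The Python sketch below (the function name is illustrative, not part of any existing tooling) builds a file name in the required form:

```python
def starcalc_filename(station, start_year, end_year, extension=".sdc"):
    """Build a file name of the form {station}_{start}_{end}{extension},
    e.g. the reference spreadsheet A028-B_1998_2000.sdc."""
    return f"{station}_{start_year}_{end_year}{extension}"
```

For example, starcalc_filename("A028-B", 1998, 2000) gives "A028-B_1998_2000.sdc", and passing extension=".txt" names the matching text export.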

3.2 Converting time to "days since 1980"

Data from argos_main simply gives the day of month and the time of an observation. These times must be converted to "days since 1980" in order to publish the data. To do this, go to the Processing spreadsheet of the A028-B_1998_2000.sdc file, copy the first 25 rows of the first 6 columns (A through F), and paste them into the Processing spreadsheet of the file you are editing. You will notice that these columns have headings as shown here:
        A      B       C     D          E                  F
   1   Year   Month   Day              No. rows: 13673    Days since 1980
   2   1998   11      6     05:31:00   36105.229861111    6884.229861111
   3   1998   11      6     07:31:00   36105.313194445    6884.313194445

In this example of the first 6 columns of the A028-B_1998_2000.sdc "Processing" spreadsheet, the blue cells signify the cells you must change in each new Processing spreadsheet established. Cell A2 contains the first year of the data being processed. Change this as required and all other years will automatically change. Cell B2 contains the first month of the data. Change this as required, and each time a day number decreases, the month will click over to the new month. Beware that if a station's data misses an entire month, you will need to search through your data to find the place where this occurs, and manually change the month number of the cell where the skip occurs.

Once you have made these two changes, fill down (to do this go to the top menu: edit>fill>down) from row 25 to the number of rows in the Preprocessing spreadsheet. This action simply copies the time algorithm for each set of observations. The total number of rows in the Processing spreadsheet should then be entered in cell E1.

You now need to set up the "Postprocessing" spreadsheet in your file. Click over to the Postprocessing spreadsheet in the A028-B_1998_2000.sdc file and see what appears in the first column: it is simply the "Days since 1980", because this is the timestamp we want to publish for each AWS observation. Set up a similar column for each row of data in your Postprocessing spreadsheet. You can do this by setting cell A1 of the Postprocessing spreadsheet to equal cell F1 of the Processing spreadsheet, and then filling down to the total number of rows you require to complete the time conversion.
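The conversion performed by columns A through F can be cross-checked outside the spreadsheet. A minimal Python sketch (the function name is illustrative) that turns a year, month, day and time into fractional days since 1 January 1980:

```python
from datetime import datetime

EPOCH = datetime(1980, 1, 1)

def days_since_1980(year, month, day, time_str):
    """Fractional days elapsed between 1 Jan 1980 00:00 and the observation.
    time_str is an "HH:MM:SS" string as it appears in column D."""
    hours, minutes, seconds = (int(part) for part in time_str.split(":"))
    observation = datetime(year, month, day, hours, minutes, seconds)
    return (observation - EPOCH).total_seconds() / 86400.0
```

The first example row above checks out: days_since_1980(1998, 11, 6, "05:31:00") gives 6884.229861..., matching column F.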

3.3 Filtering temperature, relative humidity and barometric pressure data

This section describes how to filter sharp jumps in temperature or barometric pressure data usually caused by errors in data reception and occasionally caused by maintenance on the station. Already setup in A028-B_1998_2000.sdc is a simple algorithm which automatically checks the data and replaces erroneous datum with the value 99999.9. The algorithm headings in A028-B_1998_2000.sdc can be seen in the example below for filtering 4m air temperature: 
   ....     S        T      U      V            ....
  1 ....   Ait4:    1.19   4.77   Ait4 (ed):   ....
  2 ....   -21.46   0.45   1.66   -21.46       ....

In this example, cell S1 is the raw air temperature at 4 metres equal to the value given on the "Preprocessing" spreadsheet. The header cells are given values as follows:

  • T1 = (factor1)*(mean backward looking standard deviation)
  • U1 = (factor2)*(mean forward looking standard deviation), 
  • V1 = filtered data using the following algorithm:

  • A standard deviation is taken of the current observation and the previous five observations, regardless of observation time. These are the numbers in the T column starting at row 2. (Note, however that a backward looking standard deviation can only be taken when more than five observations exist, so the first five rows contain forward looking or centered looking standard deviations in cells T2:T6) 

    A standard deviation is also taken of the current observation and the next six observations, regardless of observation time. These are the numbers in the U column starting at row 2. 

    If both of the two standard deviations are greater than the values in their respective orange header cells, the datum is considered erroneous. If both standard deviations are zero, the datum is also considered erroneous. If one standard deviation is zero and the other is greater than its respective orange header cell, the datum is also considered erroneous. All erroneous values are automatically given the value 99999.9; all values not matching these criteria are accepted and appear in column V.

    The tolerance on the number of standard deviations accepted past the mean standard deviation is controlled by changing the values of factor1 and factor2. Typical values for air temperature calculations are

    factor1=1.0, factor2=3.5

    for sub-surface temperatures are

    factor1=5.0, factor2=5.0

    and for pressure calculations are 

    factor1=4.0, factor2=4.0

    The larger the values of factor1 and factor2, the more sudden the jump in the trace is required to reject datum as erroneous. There is no reason for both factors to be the same, and they should simply be tuned to filter out the data you know to be suspect or erroneous. Factors may change between datasets, but have been found to be similar for all processed datasets. For appropriate values of the factors for relative humidity and electronics box temperature, please refer to the A028-B_1998_2000.sdc spreadsheet. 

To install this algorithm in a new spreadsheet, copy the first 25 rows of these four columns to your Processing spreadsheet. There is no need to copy them into columns S through V specifically; just use the next blank columns in your current Processing spreadsheet. Then fill down from row 25 to the end row of your data in the same manner as described for time processing in section 3.2. You also have to change the reference to the given cells on the Preprocessing spreadsheet. For example, if Ait4 is positioned in column F on the Preprocessing spreadsheet, you would change the formula in your equivalent of cell S1 to reference that column.


Fill down this formula to the end of your Processing spreadsheet in order that all temperature data in your Preprocessing spreadsheet are copied across to your Processing spreadsheet. The header in your equivalent cell to V1 will automatically change, showing that you have addressed the correct column of your Preprocessing spreadsheet. 

Although the algorithm described here is applicable to all temperature and pressure data, there will be isolated problems with raw data that have not been taken into account here. For this reason the processing of AWS data was set up in spreadsheets rather than a C or Fortran program, as it is easier to alter the algorithm in a spreadsheet. Two problems that will arise are wrapping or reflection of some sub-surface temperature data and, for the station LGB20 only, a faulty barometer. For the latter case, a correction to raw pressure data must be applied, and this procedure is documented in Appendix A. For the occasional problem of wrapping or reflection in raw sub-surface temperature data, please refer to Appendix B.
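For reference, the filtering criteria of this section can be expressed outside the spreadsheet. The Python sketch below is a simplified translation (for the first five rows it simply uses shorter backward windows rather than the forward or centred standard deviations the spreadsheet uses, and all names are illustrative):

```python
import statistics

ERROR_VALUE = 99999.9

def spike_filter(values, factor1=1.0, factor2=3.5):
    """Flag sharp jumps in a sensor trace per the criteria of section 3.3.

    For each datum, a backward looking standard deviation (current plus
    previous 5 observations) and a forward looking standard deviation
    (current plus next 6 observations) are compared against thresholds
    T1 = factor1 * mean backward std and U1 = factor2 * mean forward std.
    Erroneous data are replaced with 99999.9."""
    n = len(values)
    back = [statistics.pstdev(values[max(0, i - 5):i + 1]) for i in range(n)]
    fwd = [statistics.pstdev(values[i:i + 7]) for i in range(n)]
    t1 = factor1 * statistics.mean(back)   # orange header cell T1
    u1 = factor2 * statistics.mean(fwd)    # orange header cell U1
    filtered = []
    for value, sb, sf in zip(values, back, fwd):
        erroneous = ((sb > t1 and sf > u1)
                     or (sb == 0 and sf == 0)
                     or (sb == 0 and sf > u1)
                     or (sf == 0 and sb > t1))
        filtered.append(ERROR_VALUE if erroneous else value)
    return filtered
```

On a smooth trace with one wild value, the spike is replaced with 99999.9 while its neighbours survive, since both standard deviations must exceed their thresholds for a datum to be rejected.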

3.4 Processing pyranometer data

This section describes the procedure to convert cumulative downward short-wave radiation from argos_main to cumulative daily downward short-wave radiation for publication in the AWS dataset. Cumulative daily downward short-wave radiation is simply the total downward short-wave radiation received since the most recent local solar midnight. An example of a typical yearly curve is shown below: 
An algorithm to do the conversion is set up in the A028-B_1998_2000.sdc spreadsheet in columns G through N. The two top rows of these columns are shown in this figure: 
   ....     G       H       I       J        K                 L          M      N             ....
  1 ....   Ipy:    204.7           112.22   Ipy cumulative    Zero row   13     Ipy edited:   ....
  2 ....   17.6    222.3   204.7   0        99999.9           11         25.3   99999.9       ....

To install this algorithm into a new spreadsheet, copy the first 25 rows of these eight columns into the next available blank columns of your Processing spreadsheet. In the same manner as described in the previous subsection, change the reference of the raw data cell G1 to the correct column containing raw pyranometer data in the Preprocessing spreadsheet, and then fill down the eight columns from row 25 to the last row of your Processing spreadsheet.

Notice that the above figure shows two blue cells and one orange cell. The blue cell H1 is the maximum value of pyranometer data allowed when processing in argos_main. You must manually change this number in your new spreadsheet: to find out what the number should be, graph the pyranometer data in argos_main for any year of the station's data you are processing. Find the y-axis range of that graph in argos_main; the maximum of that range is the number to be entered in cell H1. The blue cell J1 is the decimal longitude of the station in degrees east. This number controls the time calculated as local solar midnight, and it is imperative that it is changed for each new station.

The orange cell M1 controls the filtering of pyranometer data for errors. Filtering pyranometer data for errors is difficult, as it cannot be done using variance, since the data has a naturally high variance due to diurnal oscillations. Neither can the filtering be done using change in variance, because the data has a high change in variance as a result of seasonal variation of solar zenith angle and daily variation of downward solar radiation affected by cloud. The algorithm adopted here searches the data to see if the day's maximum solar irradiance is greater than a median of surrounding maximums by the quantity in the orange cell M1 (set at 13 in the A028-B_1998_2000.sdc spreadsheet). If so, all data for that day are filtered out. This filtering is only effective for extreme errors in data.

Using this processing, there is a data loss resulting from the limit on the number of years that can be processed in one Starcalc file. You may notice that the first few days of data are lost at the beginning of the pyranometer data series in each file. There are ways around this problem, however they tend to create several other problems in spreadsheets. Therefore, the data loss has been accepted for the present, until a need arises to fix the problem.

Be aware that pyranometers in the Antarctic may collect snow, particularly during winter, and consequently strange results may arise. One such case, where snow is suspected of resting on the pyranometer at the end of winter and slowly melting off by the time of the summer solstice, is LGB20 during 1991 and 1992, shown in the graph below:

3.5 Filtering wind speed and wind direction data

Algorithms to filter wind data are set up in the A028-B_1998_2000.sdc spreadsheet in columns AV through BP. It is important to treat the wind direction and wind speed filtering together, because the wind direction editing depends on wind speed. Wind speed peaks and vector means should be treated in exactly the same manner as the normal wind speed data, just as wind vector direction should be treated like wind direction in the filtering algorithms.

Wind speed and wind direction data often have errors during winter, when temperatures are cold enough to increase the viscosity of lubricants in wind vanes and anemometers. These instruments may also seize as a result of snow or ice accumulation. A sign of wind vane and anemometer freeze-up is a sudden jump in the respective data trace, often followed by a zero data trace. That is, the data trend is well in excess of the mean trend of a station's total wind direction or wind speed record, followed by no trend at all. This error is filtered out using algorithms similar to those used for temperature, pressure and humidity data, as explained in section 3.3.

For anemometers, another sign of freeze-up is the decay of wind speed with decreasing surface height in excess of the logarithmic decrease expected. This means that as wind speeds recorded at 1m and 2m decrease to less than about 0.8*(wind speed at 4m), the anemometers are beginning to seize, and the data can be treated as erroneous. This is calculated using a roughness length between 0.01mm and 1mm, and takes into account the fact that an AWS is slowly buried by accumulating snow. An explanation of the wind speed filtering spreadsheet cells for the highest anemometer (3m or 4m) is provided here:

The columns, from left to right, are:
  • Raw wind speed from the Preprocessing spreadsheet.
  • Backward looking standard deviation of the cell plus the previous 5 cells; the orange header cell contains the mean backward looking standard deviation.
  • Forward looking standard deviation of the cell plus the next 6 cells; the orange header cell contains the mean forward looking standard deviation.
  • Data filtered as for temperature data, explained in section 3.3.
  • Data from the left hand cells checked for "zero outs"; this filters out data immediately prior to or after anemometer freeze-up.

  WM4 :   0.47    2.05              WM4 (ed):
  6.1     0.43    0.43    6.1       6.1
  6.8     0.45    0.45    6.8       6.8
  7.2     0.53    0.53    7.2       7.2
  6.8     2.05    2.05    6.8       6.8
  5.9     2.01    2.01    5.9       5.9
  6.5     0.44    2.07    6.5       6.5
  6.9     0.41    2.15    6.9       6.9
  7.4     0.49    2.12    7.4       7.4
  6       0.52    2.05    6         99999.9
  0.9     2.16    2.06    99999.9   99999.9
  6.1     2.17    0.54    6.1       99999.9
  7.1     2.22    0.98    7.1       7.1
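The logarithmic-decay check on the lower anemometers reduces to a simple ratio test. A Python sketch (function name illustrative; 99999.9 is the error value used throughout this document):

```python
ERROR_VALUE = 99999.9

def filter_lower_anemometer(wm4, wm_lower, ratio=0.8):
    """Flag 1m or 2m wind speeds that fall below about ratio * (4m wind
    speed), indicating the lower anemometer is beginning to seize.
    wm4 and wm_lower are parallel lists of wind speeds."""
    filtered = []
    for top, low in zip(wm4, wm_lower):
        if top != ERROR_VALUE and low < ratio * top:
            filtered.append(ERROR_VALUE)
        else:
            filtered.append(low)
    return filtered
```

For instance, a 1m reading of 5.0 against a 4m reading of 7.3 falls below 0.8*7.3 = 5.84 and would be treated as erroneous.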

In filtering wind direction data, we can assume that high data variance is most likely at low wind speeds, for "light and variable" winds, while at moderate and greater wind speeds lower directional variance is usual. Hence a wind speed cutoff is used below which variable wind direction is accepted. Since the wind direction can pass through 360 degrees under normal operation, it is really the variance of the change in wind direction that is used to filter data, rather than wind direction itself. In other words, the change in wind direction between one observation of 355 degrees and the next of 5 degrees is 10 degrees. An explanation of the wind direction filtering spreadsheet cells is provided here:

The columns, from left to right, are:
  • Raw wind direction data from the Preprocessing spreadsheet.
  • Change in wind direction (takes wind passing through north into account).
  • Backward looking standard deviation; the orange header cell (MIN WIND) contains the maximum wind speed for which variable wind direction is allowed.
  • Forward looking standard deviation; the orange header cell (MEAN STDV) contains the mean forward looking factor*(standard deviation).
  • Edited wind direction: any wind direction is allowed when the wind speed is less than MIN WIND; otherwise the value in MEAN STDV is used to assess the standard deviations in the two orange headed columns and filter similarly to temperature data.

  WDi :   Delta Wdi   MIN WIND   MEAN STDV   WDi (ed):
  186     0           2.03       2.03        186
  180     6           2.03       2.03        180
  175     5           2.03       2.03        175
  169     6           2.64       2.64        169
  163     6           2.64       2.64        163
  158     5           2.13       2.86        158
  152     6           0.47       2.97        152
  152     0           2.13       2.71        152
  146     6           2.19       2.71        146
  146     0           2.73       2.71        146
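The change in wind direction, taking passage through north into account, can be computed with modular arithmetic. A Python sketch (function name illustrative):

```python
def delta_wind_direction(previous_deg, current_deg):
    """Signed change in wind direction in degrees, wrapping through north:
    355 degrees followed by 5 degrees is a change of +10, not -350."""
    return (current_deg - previous_deg + 180.0) % 360.0 - 180.0
```

For the example in the text, delta_wind_direction(355, 5) gives 10.0; a Delta Wdi column holding magnitudes, as in the table above, would take the abs() of this value.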

In treating the algorithms for wind direction and wind speed together, it should be noted that there is cross referencing between cells in the wind direction and wind speed filters in A028-B_1998_2000.sdc, and so one must copy the first 25 rows of columns AV through BP from that spreadsheet. Note that the columns of the highest-above-surface wind speed are those used in filtering wind direction and the closer-to-ground wind speeds. So all other wind filtering algorithms pivot around the 4m (or 3m) wind speed filter routine. If you intend to use only some of the wind data algorithms from A028-B_1998_2000.sdc, be sure the correct cells of the 4m wind speed filter routine are referenced by the remaining algorithms after you have made your changes.

Once you have copied the wind filtering algorithm from A028-B_1998_2000.sdc and changed the referencing to the appropriate data fields on the Preprocessing spreadsheet, fill down to the base of your data, as explained in section 3.2. Make sure a duplicate of the edited data is set up in the Postprocessing spreadsheet.

3.6 Processing acoustic accumulation data

Use of acoustic sensors on AWSs is in its infancy, and some data received from the sensors have been difficult to interpret: it is necessary to distinguish whether each observation may be a transient snow drift moving past a station, or a reflection off airborne particulates (snow). Furthermore, the speed of sound is affected by ambient air temperature, which itself may change considerably within the air column from the acoustic pinger to the ground as a result of the near surface inversions possible in the Antarctic. On top of these complications, add the problem of long term changes in air temperature profiles between the pinger and the surface, caused by stations being buried, introducing a gentle bias to the data. It's enough to make you say, "Damn it, just publish the data".

The data must have basic temperature corrections applied, and this is done using the 4m air temperature. A lot of noisy data can also be removed by only accepting observations below a certain wind speed, so that blowing snow is less abundant; the cutoff wind speed is set in the spreadsheet as shown below. Temperature corrections to data are done using the formula:

Height(true) = Height(recorded) * sqrt[(273.15+Temperature)/273.15]
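As a worked example of this formula in Python (function name illustrative): the blue-cell values below for A028-B take an initial height of 3.7 m at -21.46 °C to a corrected height of about 3.55 m.

```python
import math

def true_height(recorded_height, air_temp_celsius):
    """Correct an acoustic ranging measurement for the temperature
    dependence of the speed of sound, per the formula in the text."""
    return recorded_height * math.sqrt((273.15 + air_temp_celsius) / 273.15)
```

true_height(3.7, -21.46) returns roughly 3.55, matching the "Correct Height" cell derived from "Initial Height" and "Initial Temp"; at 0 °C the correction factor is exactly 1.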

An extract of the top 10 rows of the accumulation sensor algorithm from A028-B_1998_2000.sdc is shown in the figure below. In the figure, note that the two orange cells and the columns BV and BW serve the same purpose as the filtering columns for barometric pressure, temperature and relative humidity documented in section 3.3 above. There are also five blue cells in column BZ which must be changed for each new station. The green cells only indicate what data is being used to correct the sensor for air temperature variation and to filter out high wind speed data. These green cells are not changed by the user but will change according to what values are inserted in the five blue cells in column BZ.

       A  .....   BU      BV     BW     BX            BY   BZ                                    CA       CB
  1       .....  Acc :   0.17   0.18   Acc (ed):          Accumulation is correcting against:            AiT4 (ed):
  2       .....  3.7     0.01   0.02   0.000000000        Initial Height:                        3.7
  3       .....  3.68    0.01   0.02   0.010000000        Initial Temp:                          -21.46
  4       .....  3.68    0.01   0.02   0.020000000        Correct Height:                        3.55
  5       .....  3.7     0.01   0.02   0.000000000        Column of Ait4(ed):                    21
  6       .....  3.7     0.01   0.01   0.000000000        Column of WD4(ed):                     55
  7       .....  3.72    0.01   0.01   0.000000000        Max wind speed:                        16
  8       .....  3.72    0.02   0.02   0.010000000        Wind speed is correcting against:               WM4 (ed):
  9       .....  3.72    0.01   0.01   0.020000000
 10       .....  3.74    0.01   0.01   0.000000000

Definitions of the values in the blue cells are:

  • Initial Height: The height at which the sensor was set when it was installed (taken from first observation).
  • Initial Temp: The temperature of the first observation after installation used to correct the initial height. Initial Temp and Initial Height provide the Correct height in the next cell down which is used as the reference for accumulation.
  • Column of Ait4(ed): The number of columns right of column A in which the Ait4(ed) data appears in the Processing spreadsheet. When set correctly, this will automatically change the top green row to Ait4(ed).
  • Column of WD4(ed): The number of columns right of column A in which WM4(ed) data appears in the Processing spreadsheet. When set correctly, this will automatically change the bottom green row to WM4(ed).
  • Max wind speed: The windspeed above which data is not published (data set at 99999.9 in such cases).

You should copy the first 25 rows of columns BU through CB from the A028-B_1998_2000.sdc spreadsheet when using this algorithm, and then fill down only columns BU through BX to the base of your spreadsheet to obtain edited accumulation data (denoted Acc (ed)). Of course, the necessary change to the referenced raw data in the Preprocessing spreadsheet must be made in column BU, as described for barometric pressure, temperature and relative humidity in section 3.3. Similarly, you must duplicate the Acc (ed) data in the Postprocessing spreadsheet in order to complete the processing of accumulation data.

4. Extract processed ascii data from Staroffice

It is a quick and simple task to get text output from Staroffice. This section provides a checklist to follow each time data is extracted from Starcalc, and mentions a pitfall in obtaining text output from Starcalc. 

Check you have done these things before extracting data from Staroffice:

  • Are the numbers of rows the same in the Preprocessing, Processing and Postprocessing spreadsheets?
  • Have you changed all blue cells in your spreadsheet to values unique to the station you are processing?
  • Is the start year given in cell A2 of the Processing spreadsheet correct?
  • Is the start month given in cell B2 of the Processing spreadsheet correct?
  • Are the year and month of the last observation correct for the station?
  • Is the correct number of data rows given in cell E1?
Once you are satisfied that all procedures have been followed, click over to the Postprocessing spreadsheet of the station's Starcalc file. Make sure all edited data has been duplicated to this spreadsheet, as in the example of the A028-B_1998_2000.sdc file. To extract the data as text, go to "Save as" (File > Save as), choose the "< Text - txt - csv (StarCalc) >" file type, and tick the "Edit filter settings" box. Then click "Save". When asked about the format of the text output, it is preferable, but not essential, to choose comma separated data. All other options can remain as suggested by Starcalc. You should save to the working directory in which you have been saving all data for the station being processed (see The Starcalc file and directory structure).

A significant problem is that Staroffice can sometimes write error messages into output text files even though no such error exists. This is a software problem that is best fixed by 'getting out of the software and getting in again', in the same manner adopted by software engineers to fix broken cars (joke!). The best way to check whether the error has occurred is to open your text output file and search for "Err"; a clean file will yield no matches. If the problem persists, physically copy the pyranometer data to the Postprocessing spreadsheet and save as text. Remember NOT to save this altered version as your calculating spreadsheet. Another way of fixing the problem is to change the orange cell of the pyranometer header, and then change it back to the original number. This sometimes works.
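The search can also be done from the unix prompt. A minimal sketch, assuming the output file is named LGB20_1998_2000.txt:

```shell
# grep -q exits with status 0 only if "Err" appears somewhere in the
# file; a clean Starcalc export produces no match.
if grep -q "Err" LGB20_1998_2000.txt; then
    echo "spurious error strings found - re-export the file"
else
    echo "file is clean"
fi
```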

You must obtain text output files for each set of data processed in individual starcalc files for the station. LGB20, for example, had five such text output files at the time of writing these instructions: LGB20_1991_1992.txt, LGB20_1993_1995.txt, LGB20_1996_1997.txt, LGB20_1998_2000.txt and LGB20_2001_.txt. 

5. Convert processed ascii data into netcdf format

All AWS data must be published as netcdf files. NETCDF stands for Network Common Data Form and is a platform-independent, multi-user standard for the publication of scientific data. It is often used in the atmospheric and oceanic sciences because it avoids a multitude of problems caused by people inventing their own data formats when publishing a dataset.

Converting Starcalc text output discussed in the previous section to netcdf is easy, because a Fortran 95 program exists to do it for you. In fact, it is the same program available for download in the software section of this website under "Software to convert automatic weather station netcdf datafiles to text". Although unknown to the general user, this software can also convert text to netcdf when a "-w" option is specified on the command line. This software is already set up as "antarcticaws" on the Glaciology system and shouldn't have to be installed. Once you have checked the executable exists by typing "antarcticaws" at the unix prompt (you'll get an error message rather than "command not found"), you need to create an input file for your particular station that contains all the essential information on the station. The file is in a special format, and an example of the text input file for GC41 is given here: GC41.input.

In the file, all information after "!" on a line is a comment and will not be read by the software. Information under "&INFO" must be changed to the station name you are processing, the location of the station, the elevation, and the total number of sensors you are publishing with the dataset, respectively. Information under "&VARIABLES" indicates the column order in which sensors appear in your text output files after the time column. If a sensor does not exist, set its sensor number to zero. All information under "&VARIABLES" must also be changed for each new station; however, input files have already been created for all existing stations, and are kept under their appropriate directories.
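As an illustration only (the field layout and every value below are hypothetical; always start from the real GC41.input linked above rather than this sketch), an input file is structured along these lines:

```
! Anything after "!" is a comment and is ignored by the software.
&INFO
GC41           ! station name (change for your station)
-71.6 111.3    ! station location - hypothetical values
2763           ! elevation (m) - hypothetical value
12             ! total number of sensors published - hypothetical value
&VARIABLES
1              ! column of first sensor after the time column
0              ! a sensor that does not exist is set to zero
```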

The text datafiles from each Starcalc text output must be concatenated to create a text file of the entire dataset. The example of LGB20 is used here, and you should repeat this procedure for any stations you are processing. All commands are in unix.

There are five Starcalc text data files for LGB20 at the time of writing called LGB20_1991_1992.txt, LGB20_1993_1995.txt, LGB20_1996_1997.txt, LGB20_1998_2000.txt and LGB20_2001_.txt. These must be concatenated together: 

cat LGB20_199*.txt LGB20_2001_.txt >! LGB20.all

which forms the full dataset LGB20.all. This file must be edited so that it contains only one header at the top. Open the file and delete every subsequent line beginning "Days since 1980 ...." (there are four such lines to delete in the case of LGB20, one from each concatenated file after the first). It's a two second job and creates a clean ascii record in LGB20.all. The file can then be converted to netcdf by typing at the prompt, in the directory in which you compiled the text-to-netcdf converter:

./antarcticaws -w LGB20.all LGB20.input

The program's output to screen is:

78710 records in this dataset.
Data starts on day 18 1991 and ends day 365 2001
Output file is

If you have errors in your file, the program will usually tell you the problem. You can check the netcdf output by listing the netcdf header information with the following command:

ncdump -h | more

Before copying your netcdf file to the web page, check the data thoroughly, as errors not obvious by poring over the data in a spreadsheet often exist. The best way to do this is to graph the data using the NCAR tool provided at this website, and the next section explains how to use that software to check data and publish it on this website.

6. Publish on the AWS website and graph data with NCAR graphics

This section describes how to graph netcdf data you have created, check it for errors, and then place the data on the AWS website. Before proceeding any further you must install the NCAR graphing package available on the software page. It is assumed you have enough knowledge in unix to do this, but if not, ask a unix guru for help. You need the software set up so you can execute it from anywhere in your home tree. All instructions from here on assume the software, called graphaws, is set up and working. It is already installed on the Glaciology unix system and should just work.

To check the data, plot your netcdf dataset in graphaws as described on the software page. To do a broad check of the data, plot each variable for the entire period (multiple years) on an individual graph. This usually shows up most large errors. You can also plot any three variables for a single year on the same plot, and this is also a good way of doing more detailed error checking. The program gives you the option to output the data in postscript if you would like to print any of the graphs. Once you are satisfied with your dataset, and don't need to fix data by jumping back to step 3, copy the netcdf file to /v/argos/awswebsite/data.

Now construct an HTML file of the station's data to publish on this website. graphaws does this for you. Select to plot a single year of the data, and the program will ask you if you require an HTML page. Answer "y" for yes. You will be asked which year you want to plot, and you should answer with the first year of the dataset (for LGB20, this would be 1991). A series of questions will be asked about station location and elevation, and the number of years in the dataset. When you are asked which data to plot, you must choose 4m air temperature as the first graph, barometric pressure as the second graph, and wind speed as the third graph. When giving the ranges of the graphs, choose values that accommodate the entire dataset, not just the first year's data. Upon plotting the data, the program will automatically plot all subsequent years in the dataset, and write a web page in the background with the nomenclature "stationname.html". Copy this file to /v/argos/awswebsite.

You now need to convert the graphics output to "gif" files. The graphics output is in the NCAR cgm (ncgm) format, and this can be converted to gif files by first splitting the output file, called "gmeta", into individual frames. You do this by typing med gmeta at the command line, and then at the prompt typing split {number of years in dataset}. This will place files called "med{frame number}.ncgm" in your directory. Quit out of med, and then rename each of these files to the name of the station, followed by the year the frame represents. For example, for LGB20, the dataset from 1991 to 2001 would have 11 frames, and so one would enter the following sequence of commands:

med gmeta
split 11
mv med001.ncgm LGB201991
mv med002.ncgm LGB201992
mv med003.ncgm LGB201993
...
mv med011.ncgm LGB202001
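Typing the eleven mv commands by hand can be avoided. A minimal POSIX-shell sketch, assuming the frames are named med001.ncgm through med011.ncgm and the first frame corresponds to 1991:

```shell
# Rename the split frames in order; the alphabetical glob order matches
# the frame numbering, so years are assigned sequentially from 1991.
year=1991
for f in med0*.ncgm; do
    mv "$f" "LGB20${year}"
    year=$((year + 1))
done
```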

Each of these files is an individual ncgm file, and must be converted to a gif file using the following ncgm2gif script (which can be downloaded by clicking here):

#!/bin/csh -f
# for printing colour gmeta files

@ count = 0
foreach x ($*)
    @ count++
    ctrans -d sun -res 600x600 $x | rasttopnm | ppmtogif >! $x.gif
end

echo ${count} 'colour gif files added to the directory'


Continuing with the LGB20 example, the following sequence of commands would be entered to convert the files to gif format:

ncgm2gif LGB20199*
ncgm2gif LGB20200*

The ncgm2gif script places gif files in your directory with the nomenclature {station name}{year}.gif. Copy these gif files to /v/argos/awswebsite/graphs. A postscript version of the gmeta file must also be constructed, but this will contain all the frames in a single file. To do this first move gmeta to a file of the station name. So, for example, LGB20's gmeta file would be renamed LGB20. Now use the ncgm2ps script (which you can download here): 

#!/bin/csh -f
# for printing colour gmeta files

@ count = 0
foreach x ($*)
    @ count++
    ctrans -d ps.color $x >! $
end

echo ${count} 'colour postscript files added to the directory'

Type at the command line (again using the LGB20 example): ncgm2ps LGB20. This produces the printable postscript version of the graphs available at the website, and it should also be copied to /v/argos/awswebsite/graphs.

Now move to the /v/argos/awswebsite directory, and open the data web page datapage.html. You will notice, as you scroll down, that there are java scripts set up in the data table to open windows on each station's data. Make sure one is set up for your new dataset, and also be sure it is named uniquely.

Finally, go to the web page, and check your new dataset can be viewed, and all the files can be downloaded. Tidy up your working directory so others can come after you and follow your work. 

Appendix A: Corrections to raw LGB20 barometric pressure data

The pressure sensor at LGB20 is faulty, and the raw data requires a temperature dependent correction. A correction equation is given in the graph below. To calculate the correction, data was analyzed for the period December 21 1993 to December 31 1995 from LGB35, LGB10-A and LGB20. Linear extrapolation of log(P) values from LGB35 and LGB10-A was used to arrive at a correction in pressure for each LGB20 observation. Each correction was adjusted for changes in the hydrostatic approximation between LGB10-A and LGB20 as a result of temperature differences between the two stations, and these values are plotted on the scattergram below. A least squares regression fits this scattergram with the equation dP = -27.834 - 0.02357*T, and this line appears in faint red behind the actual data points (with correlation coefficient r = -0.10602268). This is the correction applied to all LGB20 data in the Staroffice files, so that P = Bar - 27.834 - 0.02357*Ait4.
An example of how this is applied in a Starcalc Processing spreadsheet is given in the file /v/argos/awswebsite/LGB20_1993_1995.sdc.Z (uncompress this file before use). The correction did not work, however, due to long term drift in the pressure values recorded at LGB20. The sensor has considerably more problems than are accounted for by applying a linear correction across all periods: it requires a time dependent correction. This has not been applied to the LGB20 data, and so in the Starcalc files for LGB20, all barometric pressures have been set to 99999.9 until a later time.
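For reference only (the published LGB20 pressures are set to 99999.9 as explained above), the linear correction P = Bar - 27.834 - 0.02357*Ait4 can be sketched in awk; the file name pressure.txt and its two-column layout (Bar, Ait4) are illustrative, not part of the actual Starcalc workflow:

```shell
# Column 1 is raw barometric pressure (Bar), column 2 is 4 m air
# temperature (Ait4); print the corrected pressure to one decimal place.
awk '{ printf "%.1f\n", $1 - 27.834 - 0.02357 * $2 }' pressure.txt
```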

Appendix B: Problems that may occur with sub-surface temperature data

Sometimes raw sub-surface temperature data "wraps" or "reflects" at certain temperatures. Examples of these problems are shown below. Wrapped data is fixable in Staroffice, however reflected data is not. To the best of our knowledge, the latter problem has been fixed, but the examples below allow one to recognise the problem should a similar error occur in your processed data.

B.1 Wrapped data

In the case of wrapped data, the data has reached a minimum (maximum) value, and for some reason has wrapped to the maximum (minimum) value and continued the data record. An example of this is provided here:

The problem can be fixed in Staroffice, and an example of this is provided in /v/argos/awswebsite/LGB20_1993_1995.sdc.Z (uncompress this file before use). The fix for the above problem is shown here:

B.2 Reflected data

In the case of reflected data, as shown in the graph below, the data reaches a certain value, and reflects at that value so that the trend (gradient) of the data is the reverse of what it should be. No easy fix exists for this problem and it must be fixed at the point of argos file processing.

Appendix C: List of all fields published in the dataset (listed by abbreviation)

Air temperature
Subsurface temperature
Wind speed and direction
Relative humidity
Other variables

Appendix D: Ongoing processing notes

1. Berg B9B - Andrew Roberts 11/05/2004

The calibration file downloaded from Kingston has an attribute that does not allow the raw satellite data files to be processed. The final sensor (number 26) should be set to calibration type 1, however it is listed in the download from Kingston as calibration type 3.


