Automated acquisition of geophysical data from mobile platforms, accompanied by real-time data processing and real-time visualization, is replacing more laborious methods of acquiring, analyzing, and reducing data [Connor et al. 1997; Gumert 1997; Aiken et al. 1997]. Fast, asynchronous, multi-threaded, modular systems are replacing traditional linear codes. These technological innovations have resulted in significantly better data quality for the geophysical research community. Improved data quality is particularly important for field scientists such as geophysicists, oceanographers, glaciologists, and atmospheric physicists, who take scientific investigations to remote, sometimes hostile environments. Work in such environments makes data collection a high priority.
This thesis uses the current capabilities of a real-time mobile data acquisition system to develop a real-time data visualization and quality-control monitoring sub-system. Remodeling the current data routing processes results in a new real-time data distribution process which exploits multiple concurrent threads communicating asynchronously via message passing and message queues. The end product is an efficient, real-time, network-distributed data-routing suite of programs that increases packet throughput while decreasing packet latency and jitter at the multiple visualization sub-systems. These programs were specifically developed to assist the Support Office for Aerogeophysical Research (SOAR) and to help reduce the human impact upon a fragile environment by requiring fewer people to evaluate data quality, monitor the instrumentation, and identify significant data anomalies.
SOAR is funded by the National Science Foundation's Office of Polar Programs. The goal of SOAR is to provide research scientists with high quality aerogeophysical observations of the Antarctic Ice Sheet [Magsino et al. 1998]. To achieve this mission, SOAR maintains a Twin Otter aircraft (henceforth referred to as SJB), specially modified to carry a combination of geophysical and computer instrumentation, for flying geophysical surveys over Antarctica. Data collected during these surveys is being used to determine the influences of underlying geology on ice-sheet dynamics, primarily to assess the potential for ice sheet collapse and subsequent global sea-level rise [Blankenship et al. 1993, Bell et al. 1998]. Antarctic surveying by SOAR calls for a science team of data collectors to spend long airborne hours collecting large quantities of digital geophysical data. Back at the base station, this flight team hands the geophysical data over to the base station crew, who must spend monotonous hours downloading data, preprocessing data for storage, and generating quality control data plots and statistical charts for yet another science team to analyze. This is an arduous process, to say the least. Careful science requires a review of the quality control plots and statistical analyses to determine the validity of the geophysical data (i.e. were the instruments operating correctly?). In the event that data quality falls below an acceptable level, decisions to recollect or nullify the data set are made long after the data has been collected.
Consider a highly automated scenario. Like the previous science team, data-collectors are required to spend long airborne hours collecting geophysical data. In addition to acquiring data, the computer system operated by the science team is programmed to process the data during acquisition, formulate quality control information, and provide real-time data visualizations. Data quality and instrument operation are monitored as the acquisition process proceeds. In the event of instrument failure, the flight team knows when, and possibly why, the failure occurred. Steps can be taken to correct the operation of the faulty instrument. The data acquisition team arrives back at the base station with a set of digital results of known quality, ready to be reviewed and modeled by researchers and scientists.
The simple advantages of real-time signal processing and real-time visualization promise higher quality data sets for this scientific community. Real-time computing allows faster diagnoses of equipment failures during acquisition and reduces the arduous task of post-processing data. For high-overhead operations, such as geophysical surveying in Antarctica, a more automated system reduces the number of personnel required for fieldwork and decreases the quantity of computer equipment utilized in the field. Fewer people in the field and reduced equipment needs result in lower operation costs and less impact on a largely pristine environment. Considering the scope and importance of geophysical research in Antarctica, establishing optimized real-time programming and visualization methods is crucial for continued scientific progress.
Early attempts at collecting geophysical data resulted in sparse data sets, much arm waving and grand assumptions based on inadequate data. Calculations performed using these sparse data sets were susceptible to large errors [Jackson 1972; Kleiner and Graedel 1980]. The advent of the global positioning system (GPS) automated the collection of large volumes of digital data by quickly and accurately positioning the measurements. Utilizing the dense spatial data sets collected with SOAR’s acquisition aircraft, numerous complexities about the Antarctic Ice Sheet have been uncovered. A buried volcano has been discovered supplying an underlying heat source to an already unstable ice surface [Blankenship et al. 1993]. Ice stream calculations of surface velocity, ice thickness, strain rates, and mass balance have revealed details about the rate at which the West Antarctic Ice Sheet is releasing its mass to the ocean [Bindschadler et al. 1996].
Other geoscientists are using real-time, mobile, distributed networks to automate geophysical data collection. Hurricane hunters have put together a real-time data acquisition/visualization system using the Linux operating system and a scanning radar altimeter (SRA) that measures directional wave spectra and storm surge in real-time [Wright and Walsh, 1999]. Their system will be used to refine hurricane models and improve storm-forecasting techniques. The SRA writes formatted data to a shared memory block where other concurrently executing processes access the data. Real-time displays are generated using Tcl/Tk and the Blt graphics library. Real-time airborne remote sensing using the SRA and its predecessor, the surface contour radar, SCR, has been previously used to study the sea surface and develop wave models for examination of near-shore circulation and beach erosion [Walsh et al. 1985, 1989, and Hwang et al. 1998].
One example of a shipboard data acquisition, data processing, and analysis effort is an integrated hardware and software system developed by the National Aeronautics and Space Administration (NASA) and the Naval Oceanographic Office (NAVOCEANO) [Miller et al. 1995, Davis et al. 1995]. This real-time measurement system coordinates multiple serial outputs from oceanographic instruments and utilizes their data to determine the propagation of light through the water column. Underwater optics affects the performance of communications systems, weapons, optical surveillance techniques, search and recovery, and coastal mapping. The system employs a uniform, defined packet format for identifying, transferring, and storing data. A visualization screen is available for displaying real-time data acquisition or a playback of previously acquired data sets.
Mars exploration teams are employing mechanical rovers, acting as artificial field geologists, that collect images of, and interact with, materials on the Martian surface [Stoker 1998, Arvidson et al. 1998, and Matijevic et al. 1997]. These computer driven systems, controlled and monitored remotely, require efficient onboard algorithms to filter collected data before transmittal back to Earth, where rapid decisions must be made in response to the results received. Knowing the location and heading of the rover, and the pointing direction of onboard instrumentation, is essential during operational missions. High quality real-time imaging and pattern recognition are necessary for remote object identification and distance determination to aid in rover guidance.
As a final example, I have previously worked with geophysicists to develop a simple, robust system for visualizing and processing magnetic data in real time. Magnetic anomaly maps are utilized when assessing an area for volcanic hazards. Using the Linux operating system, an ASCII stream of survey data (i.e. a time-stamped GPS data stream and magnetic reading from a magnetometer) is radio-telemetered, by a mobile survey team, to a real-time visualization station where the GPS and magnetic data are associated and processed for map plotting and then used for visualization. The freely available XForms package is used to track, in real-time, the survey team’s location and the acquired magnetic readings. Real-time visualization of this mobilely acquired magnetic data has proved to be an effective tool for rapidly identifying magnetic anomalies and mitigating difficulties that frequently accompany geophysical surveying [Connor and Connor, 1999].
Arguably, in the future, all geophysical exploration will rely on the use of real-time mobile distributed networks. This technology, as these few examples show, can result in improved data quality control, faster data collection, and higher resolution of geophysical data sets. This transformation from a discipline that relies on interpretation of sparse data sets, to one that uses high quality, dense sampling, will be driven by computer science.
Prior to this application, computer science research had not directly addressed the various real-time data acquisition, processing, and visualization problems in geophysical surveying. Nonetheless, a review of current computer science literature reveals various areas of research applicable to high speed data acquisition, transmission, and real-time visualization in geophysics. Areas of relevant computer science research include: video and audio transmission over the Internet, prototyping methods for designing real-time systems, the use of process graphs for specifying real-time system requirements, and techniques for evaluating packet routing performance over the Internet. These areas of research and their relevance to geophysical data acquisition onboard SJB are reviewed in the remainder of this section.
The High Performance Distributed Computing project (HiPer-D) has been developing a prototype for a distributed, real-time, shipboard Command and Control system for the U.S. Navy [Swick et al. 1996, Welch et al. 1996, Welch et al. 1998]. Two types of distributed data are discussed, control data and streaming data. Streaming data is analogous to live audio or video transmissions, comparable to the data streams acquired aboard SJB. System requirements include reliability (e.g. FIFO ordering of streaming data and reliable delivery of control commands), efficiency (e.g. low latency and low inter-arrival jitter), concurrency (i.e. multiple threads for maximizing throughput), fault tolerance (e.g. no single point of failure), and real-time support throughout the network for guaranteeing end-to-end networked data transfers. They are looking for a workable solution that satisfies all of their communication requirements.
Designing a real-time application begins with analyzing and characterizing current real-time systems and then prescribing methodologies that illustrate system operation to developers and users. Processing graphs are being used to model and analyze real-time data flow in signal processing systems, providing a formal specification and knowledge about end-to-end latencies, memory requirements, schedulability, and real-time execution rate [Goddard and Jeffay 1996, 1997, 1998, 1999]. Utilizing these general processing graphs one can predict real-time requirements for signal processing applications. An executable, rapid prototyping environment is another design method proposed to reduce the cost and development time of real-time applications. One example, REAR (Rapid Prototyping Environment for Advanced Real-Time Systems) [Fischer et al. 1998, Stephan et al. 1999], utilizes a global PCI bus to connect a system of real-time processing units. These units communicate via asynchronous signals and message-based IPC, using a control module for sending and receiving messages, message queues for communication between units, and multiple concurrent threads for message processing. These are the same elements encapsulated within the real-time packet routing mechanism designed in this thesis for the distributed, data packet transport aboard SJB.
Time-sensitive continuous media (CM) data, such as audio and video streams, is currently commanding much attention within the computer science research community [Padhye et al. 1999, Parris et al. 1999 and 1998, Rubenstein et al. 1999, Sen et al. 1999, Salehi et al. 1998, Sahu et al. 1998]. Since high-speed real-time data acquisition streams are similar to CM data, current research addressing problems associated with interactive multimedia over the Internet provides parallels to high-speed real-time data acquisition and visualization systems. Congestion control protocols for CM data try to achieve fairness by sharing congested Internet bandwidth equitably with non-multimedia applications. More specifically, these protocols are used in systems where Transmission Control Protocol (TCP) packet streams (i.e. applications using standard TCP congestion avoidance protocols) coexist with User Datagram Protocol (UDP) packet streams (i.e. applications using non-standard ways of recognizing network congestion). Padhye et al. [1999] have proposed a flexible rate-adjusting congestion control protocol as a first step towards a comprehensive model of congestion control for time-sensitive data streams. Active queue management in routers is another proposed way of dealing with congestion on the Internet. Random Early Detection (RED), Flow Random Early Detection (FRED), and Drop Preference Management (DPM) or Class-Based Thresholds (CBT) are active queue management mechanisms, each addressing some weakness of the preceding scheme. The CBT and DPM mechanisms are designed specifically for interactive multimedia streams [Parris et al. 1998, 1999]. These mechanisms tag a CM data stream before it reaches a router; during times of congestion they constrain throughput and minimize latency by limiting the average number of CM packets enqueued and by preferentially dropping the oldest packet already queued.
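The drop-oldest idea can be illustrated with a small sketch. This is not the CBT or DPM algorithm from Parris et al.; it is a simplified illustration that assumes a hard limit on queue length rather than a limit on the average occupancy, and the class name DropOldestQueue and its fields are hypothetical.

```python
from collections import deque


class DropOldestQueue:
    """Bounded FIFO queue that discards the oldest packet when full.

    Simplifying assumption: a hard cap on queue length is used here,
    whereas the mechanisms in the text limit the *average* number of
    CM packets enqueued.
    """

    def __init__(self, limit):
        self.limit = limit
        self.packets = deque()
        self.dropped = 0  # count of packets discarded due to congestion

    def enqueue(self, packet):
        # When the queue is full, drop the oldest packet (the head)
        # so that the freshest data is kept for the receiver.
        if len(self.packets) >= self.limit:
            self.packets.popleft()
            self.dropped += 1
        self.packets.append(packet)

    def dequeue(self):
        # Return the oldest remaining packet, or None if empty.
        return self.packets.popleft() if self.packets else None


q = DropOldestQueue(limit=3)
for sequence_number in range(5):
    q.enqueue(sequence_number)
```

For time-sensitive streams, dropping from the head rather than the tail favors recent packets, which are the ones still useful to a real-time display.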
The packet routing scheme designed in this thesis to be used aboard SJB uses a router-like process that accepts data packets and writes them to individual message queues, for transport via an Ethernet to visualization processes. Controlling congestion within a QNX network, similarly, involves process scheduling and queue management. A strictly QNX-based distributed network has the ability to operate below the TCP/UDP level using a message passing form of IPC that transparently operates between processes on a single machine and between processes executing on different nodes within the network [QNX 1998]. Traditional socket-based inter-process communication (e.g. the TCP and UDP protocols) is implemented in QNX using this message-passing scheme [QNX 1997]. By operating below the TCP/UDP level, QNX eliminates the overhead (i.e. latency) associated with these protocols but must handle similar problems associated with identifying and addressing congestion within the network when data acquisition and packet transfer rates are high.
Data transfer using multicast groups is another research area applicable to high speed data acquisition and visualization. One problem encountered is how to control the transfer rates of bulk data to multiple heterogeneous receivers. This is analogous to multiple visualization stations receiving data from a single data acquisition process. In one approach to controlling data transmission rates to multiple receivers, the sender concurrently transmits to multiple multicast groups at different rates, so that each receiver can collect data from the group that provides the best rate for that receiver [Bhattacharyya et al. 1998]. Rubenstein et al. [1999] identify four fairness properties and formally demonstrate the potential fairness benefits gained by allowing multicast sessions to be multi-rate. Smoothing reduces the variability seen in bit rates (i.e. burstiness) at the receiving end [Salehi et al. 1998]. By combining smoothing with differential caching (i.e. intermediate buffering), differences in incoming and outgoing transmission schedules can be accommodated [Sen et al. 1999].
The final research area reviewed concerns network performance analysis and measurements of network dynamics. One study evaluates loss correlation for queues with bursts of input traffic [Schulzrinne et al. 1992]. Loss correlation was measured using the conditional loss probability (CLP) metric, defined as the conditional probability that a packet from a data stream is lost given that its immediate predecessor was also lost. Traffic patterns were found to have a strong influence on loss correlation, while buffer size had virtually no influence.
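As a minimal sketch of how the CLP metric could be computed from a recorded loss trace, assume the trace is a list of booleans in packet order, with True marking a lost packet. The function names and the sample trace are illustrative and are not taken from Schulzrinne et al.

```python
def conditional_loss_probability(lost):
    """Estimate P(packet i lost | packet i-1 lost) from a loss trace.

    `lost` is a list of booleans in packet order (True = lost).
    Returns None when no packet with a successor was lost, since the
    conditional probability is then undefined.
    """
    predecessor_lost = 0
    both_lost = 0
    for previous, current in zip(lost, lost[1:]):
        if previous:
            predecessor_lost += 1
            if current:
                both_lost += 1
    if predecessor_lost == 0:
        return None
    return both_lost / predecessor_lost


def unconditional_loss_probability(lost):
    """Fraction of all packets in the trace that were lost."""
    return sum(lost) / len(lost)


# Illustrative trace: a single burst of three consecutive losses.
trace = [False, False, False, True, True, True, False, False, False, False]
clp = conditional_loss_probability(trace)
ulp = unconditional_loss_probability(trace)
```

For this trace the unconditional loss rate is 3/10 while the CLP is 2/3. A CLP well above the unconditional rate indicates that losses arrive in bursts rather than independently.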
A particularly relevant study by Yajnik et al. [1996] utilized the MBone multicast network to examine the occurrences of packet loss among 12 world-wide recipients participating in a multicast session. Two interesting observations were the significant number of burst loss (i.e. consecutive packet loss) events lasting from a few seconds up to three minutes, and the periodic packet loss events lasting approximately 0.6 sec and occurring at 30 second intervals [Yajnik et al. 1996]. Burst loss events and periodic loss events were also identified during packet transmissions over the QNX network being tested in this thesis. Another experiment measured packet loss and throughput at the application level using the very-high-speed Internet2 network. Internet2 was found to be access-constrained (by full packet queues at routers), indicating network performance was influenced more by packet-rate (i.e., number of packets) than by bit-rate (i.e., packet size) [Clark and Jeffay 1999]. Increasing packet delay may act as a predictor of packet loss such that adaptive CM applications could take anticipatory action based on its occurrence [Moon et al. 1998]. The temporal dependence in packet loss data is revisited in a later study [Yajnik et al. 1999] where Internet packet loss is measured and then modeled, but again the results are inconclusive. More performance analyses are required to better understand observed packet loss anomalies and their spatial and temporal interdependence. This thesis measures packet loss seen within a QNX real-time distributed network during simulated high-speed data acquisition and attempts to understand and moderate its effect on real-time visualization performance.
This thesis is motivated by the geoscience research community's search for state-of-the-art computer science solutions to real-time data acquisition problems. Computer science expertise is indispensable in efficiently designing, optimizing, analyzing, and visualizing real-time, high-speed, data acquisition systems. The first part of this thesis includes a description of the acquisition system currently employed by SOAR. QNX, a real-time Unix-like operating system, drives the data acquisition and visualization processes. Nine data collecting instruments and six computers are installed onboard the surveying aircraft, allowing it to fly over Antarctic terrain and collect data used for geologic assessment of the surveyed areas.
Instrumentation on the plane monitored by real-time visualization includes:
A ground-based magnetometer station was also controlled and monitored using a real-time display.
The purpose of real-time visualization is to monitor the data acquisition process and quickly discover instrumentation problems. Simultaneously, real-time visualization informs users of data quality and speeds the recognition of significant geophysical anomalies that are discovered and interpreted from the acquired data. The aircraft computer system operates as a distributed QNX network. One acquisition computer and four display computers are connected linearly via coaxial cabling.
The QNX operating system utilizes a modular, microkernel design and was chosen for data acquisition and real-time visualization because of its real-time capabilities, distributed network design, fast context switching, excellent hardware interrupt support, small footprint, and true multi-tasking. QNX inter-process communication is based on message passing which is used extensively during data acquisition and visualization. A short description is given of the data acquisition processes, but greater attention is devoted to the development of a real-time visualization and monitoring screen designed to provide real-time instrument feedback for the aircraft operators.
The acquisition system and real-time display system run as separate QNX processes on separate computers. All data streams entering the acquisition system are saved as stream-tagged, time-stamped, data packets. Data packets have a consistent structure. A real-time display process requests data packets from the acquisition process and extracts the information to be displayed. Data acquired through serial ports (all but the radar data) are displayed as constantly moving strip charts that keep a constant data range but automatically adjust chart boundaries based on immediate values. These device-level charts attempt to display data values from each data packet received from the acquisition system.
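One way to read the strip-chart behavior described above is as a vertical window of fixed span that slides to follow the incoming values. The sketch below assumes the window is re-centered on any value that falls outside the current boundaries; the source does not state the exact adjustment rule, and the function name adjust_bounds is hypothetical.

```python
def adjust_bounds(low, high, value):
    """Return chart boundaries that contain `value` with an unchanged span.

    Assumption (not stated in the source): when a value falls outside
    the current boundaries, the window is re-centered on that value.
    Otherwise the boundaries are left untouched.
    """
    span = high - low  # the data range shown on the chart stays constant
    if low <= value <= high:
        return low, high
    half = span / 2.0
    return value - half, value + half


# A chart showing 0..10 receives an in-range and an out-of-range value.
unchanged = adjust_bounds(0.0, 10.0, 5.0)
shifted = adjust_bounds(0.0, 10.0, 14.0)
```

Holding the span constant keeps the vertical scale comparable over time, so a sudden jump or a flat-lined instrument is visually obvious even after the boundaries move.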
The instruments' data collection rates vary from 1 Hz to 10 Hz. The overall movement of each strip chart remains constant to allow for the comparison of data values at similar collection times. The purpose of the device-level charts is to quickly indicate equipment failures. Besides the device-level charts, experiment-level strip charts reveal long-term data trends and data anomalies by filtering each of the data streams and updating their respective data charts at regularly timed intervals. This screen was field-tested during SOAR’s 1998-99 eight-week geophysical survey season in Antarctica and proved extremely useful at highlighting equipment malfunctions and outside-of-system interference. The aircraft equipment operators aboard SJB also appreciated the chance to view real-time progress of the survey in addition to gaping at the truly awesome TransAntarctic mountain scene below. In time, real-time visualization of geophysical data may help to reduce the number of necessary Antarctic field assistants by requiring fewer people to evaluate data quality, monitor the instrumentation, and identify significant data anomalies.
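The experiment-level charts can be pictured as a reduction of each stream to one value per fixed time interval. The source does not specify which filter is applied, so the sketch below assumes a simple mean over each interval; the function name interval_means and the sample data are illustrative only.

```python
def interval_means(samples, interval):
    """Reduce (timestamp, value) samples to one mean value per interval.

    `samples` must be in time order, with timestamps in seconds.
    Assumption: a plain average is used as the filter; the actual
    filter applied to each stream is not described in the text.
    Returns a list of (interval_start_time, mean_value) pairs.
    """
    results = []
    current_bin = None
    total = 0.0
    count = 0
    for timestamp, value in samples:
        bin_index = int(timestamp // interval)
        # Emit the finished interval when a sample falls in a new one.
        if current_bin is not None and bin_index != current_bin:
            results.append((current_bin * interval, total / count))
            total, count = 0.0, 0
        current_bin = bin_index
        total += value
        count += 1
    if count:
        results.append((current_bin * interval, total / count))
    return results


# A 2 Hz stream reduced to one chart point per second.
stream = [(0.0, 1.0), (0.5, 3.0), (1.0, 5.0), (1.5, 7.0), (2.0, 9.0)]
chart_points = interval_means(stream, interval=1.0)
```

Because every stream is reduced to the same update interval, instruments sampling at 1 Hz and 10 Hz produce chart points that line up in time.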
During real-time visualization, the field tests uncovered inefficient data packet routing to the display stations. The second part of this thesis describes an improved data routing scheme using message queues, asynchronous message passing, and multiple concurrent threads. Three processes are created for routing packets: a Spool process for receiving acquired packets from the acquisition system, a Disk-writer process for recording data packets to permanent storage, and a Coordinator process in charge of communicating with the real-time visualization stations. The Coordinator collects acquired data packets from a message queue filled by Spool and transfers them, via an Ethernet, to waiting visualization processes. To do so, it creates an individual Communication-thread and message queue for routing packets to each real-time display process. A performance comparison reveals the original routing mechanism's weaknesses and demonstrates the advantages gained by utilizing concurrency and asynchronous communication. Implementing the improved packet routing scheme overcomes the shortcomings of the original system and allows for future increases in data acquisition rates.
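The fan-out structure of this routing scheme can be sketched with ordinary Python threads and queues. This is an illustration of the architecture only: it substitutes Python's queue and threading modules for QNX message passing, omits the Disk-writer process, and stands in for network delivery by appending to a list. All function names and the sample packets are hypothetical.

```python
import queue
import threading

SENTINEL = None  # marks the end of the packet stream


def coordinator(spool_queue, display_queues):
    """Copy each packet from the spool queue to every display queue."""
    while True:
        packet = spool_queue.get()
        for display_queue in display_queues:
            display_queue.put(packet)  # sentinel is forwarded as well
        if packet is SENTINEL:
            break


def communication_thread(display_queue, delivered):
    """Drain one display's queue; appending stands in for a network send."""
    while True:
        packet = display_queue.get()
        if packet is SENTINEL:
            break
        delivered.append(packet)


def route(packets, display_count):
    """Route packets to `display_count` displays and return what each got."""
    spool_queue = queue.Queue()
    display_queues = [queue.Queue() for _ in range(display_count)]
    delivered = [[] for _ in range(display_count)]

    threads = [threading.Thread(target=coordinator,
                                args=(spool_queue, display_queues))]
    for display_queue, sink in zip(display_queues, delivered):
        threads.append(threading.Thread(target=communication_thread,
                                        args=(display_queue, sink)))
    for thread in threads:
        thread.start()

    # The caller plays the role of Spool, filling the shared queue.
    for packet in packets:
        spool_queue.put(packet)
    spool_queue.put(SENTINEL)

    for thread in threads:
        thread.join()
    return delivered


# Hypothetical stream-tagged, time-stamped packets sent to four displays.
sample_packets = [("gps", 0.0, b"a"), ("mag", 0.1, b"b"), ("gps", 1.0, b"c")]
received = route(sample_packets, display_count=4)
```

Giving each display its own queue and thread means a slow or stalled display delays only its own queue, not the acquisition side or the other displays.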