[RFC 004] Operation Review Comment 17
Organization: JGOFS/DMO
Review of DAP2 operational experience:
Date: 1 April 2005 (no joke!)
NASA's Earth Science Data Systems Standards Process Group (SPG) is considering the Data Access Protocol, Version 2 (DAP2), for adoption as a community standard. This is the second review of DAP2, this one focusing on operational experience. The questions below are provided to guide your feedback. You only need to answer questions applicable to you. Please send comments to ese-rfc-004@spg.gsfc.nasa.gov.
- Describe in a sentence or two your overall operational experience related to DAP2 (e.g., scientific analysis; science users; server operation; database management; or data translation, etc.). What kinds of DAP2 systems do you have experience with? (e.g., OPeNDAP netCDF server, OPeNDAP HDF server, OPeNDAP Matlab Toolkit, Ferret, GrADS, etc.)
Answer: we use DAP2 for all of the listed purposes: scientific analysis, science users, server operation, database management, and data translation.
The U.S. JGOFS Data Management Office (http://usjgofs.whoi.edu)
uses DAP2 (OPeNDAP) to serve ocean science data and model
results in JGOFS format and NetCDF. Users can access data directly
via OPeNDAP-enabled clients (Ferret, MATLAB, etc.),
or via Live Access Server (http://usjgofs.whoi.edu/las).
- How long have you been using DAP2 operationally?
Answer: using DAP2 or its predecessor (DODS) since 1997
- What types of applications do you use DAP2 systems for? Are the DAP2 systems applicable to your applications? (e.g., Do they work well with the data types and data manipulations in your application?) Answer: see above answer
- How many of your applications use DAP2?
i. Total number of applications: Answer: 3
ii. Percentage of applications: not sure (unable to estimate total)
- Why do you choose to use DAP2 systems over other systems for your applications? Answer: DODS was developed originally as an extension of the JGOFS DBMS, so it was a natural choice for the US JGOFS DMO. It has continued to work well, technical support is prompt when it does behave poorly, and it is a reliable method of sharing and accessing distributed data resources.
- What alternative technologies did you consider? Answer: none
- Are the DAP2 systems easy to use? (e.g., Is it hard to learn how to use DAP2 systems?) Answer: yes, very easy; the learning curve is very brief
This is huge but difficult to quantify, since clearly I've no idea what the "user support workload" would be without DAP2. ;) However, since the US JGOFS DMO is responsible for making the US JGOFS results available to the world, I can imagine the workload of distributing the 20 GB (12000+ data entities) would be overwhelming without an efficient, reliable method of on-demand user access. On average the server records 12-20 requests per month from different IP addresses (I count each distinct IP address only once) for data entities directly from the OPeNDAP server. Additionally, the server averages 200 requests (again, from different IP addresses) per month for data entities via Live Access Server (which relies on OPeNDAP).
Would 200 requests per month for specific data collections be representative of my user support workload? Perhaps. I think the important concept is that OPeNDAP is a reliable method of satisfying requests from an average of 200 different users per month.
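The counting rule described above (each distinct IP address counted once per month) can be illustrated with a minimal sketch. The record layout, the function name `unique_ips_per_month`, and the sample addresses are all assumptions made for illustration; they are not taken from the DMO's actual server logs or tooling.

```python
from collections import defaultdict


def unique_ips_per_month(records):
    """Count distinct client IP addresses per month.

    `records` is an iterable of (month, ip) pairs, for example
    ("2005-03", "192.0.2.1"). This layout is hypothetical; a real
    server log would need to be parsed into this form first.
    """
    seen = defaultdict(set)
    for month, ip in records:
        # A set keeps each IP address only once per month.
        seen[month].add(ip)
    return {month: len(ips) for month, ips in seen.items()}


# Hypothetical sample data using documentation-reserved addresses.
sample = [
    ("2005-02", "192.0.2.1"),
    ("2005-02", "192.0.2.1"),     # repeat visit, not counted again
    ("2005-02", "198.51.100.7"),
    ("2005-03", "192.0.2.1"),
]

print(unique_ips_per_month(sample))
```

Counting distinct addresses rather than raw requests gives a rough lower bound on the number of users, since several users behind one address are counted as one.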
- Does the performance of the DAP2 systems you have experienced meet your requirements? (e.g., Does it take a long time to access data in DAP2 systems?)
Answer: yes - performance of DAP2 systems meets my requirements
- Have your bandwidth issues changed since you have been using DAP2 systems? (e.g., Have you seen increased bandwidth requirements because of increased data access or DAP2 overhead, decreased bandwidth requirements because of reduced data volume from subsetting, no impact on bandwidth because of low usage?) Answer: bandwidth requirements have definitely increased over the past 10 years; not sure how I would estimate the DAP2 overhead
- What operational challenges do the DAP2 systems present? (e.g., Does it require advanced processing power, large amounts of memory, complex configuration, etc.? Are the systems easy to deploy and maintain?)
Answer: running a data server requires "advanced processing power, large amounts of memory, complex configuration", but this is not because it's a DAP2 system. DAP2 itself does not increase the load in a way I can measure. Since I haven't tried another delivery mechanism, I can't compare DAP2 with anything else.
- How has the use of DAP2 affected your systems administration workload?
Answer: trivial impact; I upgrade the app several times a year and only infrequently does this task require more than an hour of my time
- How well do the DAP2 systems scale to large numbers of simultaneous users, or to large datasets?
- How much data does your DAP2 system handle in a typical month?
i. Total data volume: 20 GB
ii. Total number of data files: 12000+
iii. Percentage of data volume: 85%
iv. Percentage of data files: 50%
- Can you provide information on user statistics of your DAP2 systems?
- How many users does your DAP2 system handle in a typical month?
i. Total number of DAP2 users: 200
ii. Percentage of your overall users: (users of what?)
- Of the feedback you have received from your users on DAP2, what percentage is positive and what is negative? Answer: 100% positive feedback.
- How have the user statistics changed over time?
Answer: increased dramatically; I started gathering stats in April 1997, when average DODS access was 0 per month, and no one outside the DMO used it for the entire first year. In January 2002 the DODS server registered 8 requests, and a total of 250 requests for all of 2002. Three years later, we have averaged almost that number per month so far in 2005.
I hope these responses are helpful. I strongly recommend the adoption of DAP2 as a standard protocol. My experiences with DODS-OPeNDAP-DAP2 have been very favorable. DAP2 is a robust, reliable protocol which greatly simplifies the task of creating federated data systems, from both the user's and the provider's perspective.