
Data Retrieval from Xinjiang Astronomical Observatory's Pulsar Data Archive

2016-10-27

Astronomical Research & Technology, 2016, Issue 4
Keywords: data files; Heidelberg; pulsar

Zhang Hailong 1,2, Markus Demleitner 3, Wang Na 1,2, Yuan Jianping 1,2, Nie Jun 1,2, Wang Jie 1

(1. Xinjiang Astronomical Observatory, Chinese Academy of Sciences, Urumqi 830011, China, Email: zhanghailong@xao.ac.cn; 2. Key Laboratory of Radio Astronomy, Chinese Academy of Sciences, Nanjing 210008, China; 3. Heidelberg University, Zentrum für Astronomie, Mönchhofstr. 12-14, 69120 Heidelberg, Germany)




The Xinjiang Astronomical Observatory (XAO) Pulsar Data Archive currently provides access to 32,290 data files obtained from observations carried out with the 25 m radio telescope at Nanshan station since 2000. Both the data files and the access methods comply with Virtual Observatory (VO) standards and protocols. This paper is a tutorial on using the XAO Pulsar Data Archive through VO tools and the on-line interface; it also describes the data currently stored in the archive and presents the ways in which the data can be searched and downloaded.

Pulsar data; Data center; Virtual observatory; Data query

1 Introduction

The data archive portal (http://data.xao.ac.cn) of Xinjiang Astronomical Observatory (XAO, http://www.xao.ac.cn) is the primary repository for XAO data products and the main interface to the science user community. The XAO Pulsar Data Archive provides authorized access to 32,290 files from observations of pulsars recorded at the XAO Nanshan station. The on-line interface is http://data.xao.ac.cn/pul/pulsar/q/form. Not all data are in the public domain yet, and access is divided into two levels. The first level is for data preview: logging in with username “pulsar” and password “astronomy” lets one browse the data, i.e. view the corresponding pulse profile and other detailed information, but not download the original files. The second level allows one to download the raw data files. Because the data may only be used with the permission of Dr. Na Wang (na.wang@xao.ac.cn), please send her your request by email.

The pulsar [1] timing data were obtained with the Nanshan 25 m radio telescope. Observations commenced in January 2000; until June 2002 they were made with a dual-channel room-temperature receiver with a bandwidth of 320 MHz centered at 1540 MHz. De-dispersion was provided by a 2 × 128 × 2.5 MHz analog filter-bank (AFB). The format of the AFB data is “Timer”. A cryogenic receiver was mounted in July 2002, which improved the sensitivity to 0.5 mJy. In January 2010, a digital filter-bank (DFB) [2] system came into operation. Its higher time resolution allows us to monitor about 280 pulsars, including ten millisecond pulsars (MSPs). The format of the DFB data is “Psrfit”. The data can be read and analyzed with the “psrchive” [3] program.
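As a rough illustration of the filter-bank numbers above, the following sketch lays out 128 channels of 2.5 MHz across the 320 MHz band centered at 1540 MHz and evaluates the standard cold-plasma dispersion delay between the outermost channels, which is the delay that de-dispersion has to remove. Reading the leading factor of 2 as the two polarization channels, the lowest-first channel ordering, the dispersion measure of 100 pc cm^-3 and all function names are assumptions made for this example; they are not taken from the XAO back-end.

```python
# Illustrative only: channel ordering and the DM value are assumed, not XAO settings.
K_DM_MS = 4.148808  # dispersion constant in ms GHz^2 / (pc cm^-3)


def channel_centres_mhz(centre_mhz=1540.0, n_chan=128, chan_bw_mhz=2.5):
    """Centre frequency of each channel, lowest frequency first (assumed order)."""
    low_edge = centre_mhz - n_chan * chan_bw_mhz / 2.0
    return [low_edge + (i + 0.5) * chan_bw_mhz for i in range(n_chan)]


def dispersion_delay_ms(dm, f_low_mhz, f_high_mhz):
    """Delay of the signal at f_low relative to f_high for dispersion measure dm."""
    f_low_ghz = f_low_mhz / 1000.0
    f_high_ghz = f_high_mhz / 1000.0
    return K_DM_MS * dm * (f_low_ghz ** -2 - f_high_ghz ** -2)


centres = channel_centres_mhz()
total_bw_mhz = len(centres) * 2.5  # 128 channels of 2.5 MHz
delay_ms = dispersion_delay_ms(100.0, centres[0], centres[-1])

print(f"channel centres span {centres[0]:.2f} to {centres[-1]:.2f} MHz")
print(f"total bandwidth {total_bw_mhz:.0f} MHz")
print(f"delay across the band for DM = 100: {delay_ms:.1f} ms")
```

The total of 128 × 2.5 MHz reproduces the 320 MHz receiver bandwidth quoted in the text.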

2 GAVO

The XAO pulsar data release environment is built on the data publishing framework of GAVO [4]. The German Astrophysical Virtual Observatory (GAVO) is the German contribution to the International Virtual Observatory Association (IVOA, http://www.ivoa.net/), an international effort to create and expand the Virtual Observatory (VO). GAVO's services [5] are open to all astronomers as well as the general public (http://www.g-vo.org).

The goals of the Virtual Observatory are to allow or improve access to astronomical data of all kinds (astrometry, photometry, spectroscopy, time series, ...) from anywhere through well-defined protocols; to let astronomers easily discover, access and use data relevant to their research; and to ensure that data do not simply disappear, but are properly described and can be accessed and understood in the future. It also aims to provide software that helps astronomers make use of all of this.

3 Archive data sets and format

All of the pulsar data stored in the data archive follow the PSRFITS [6] standard. Each file contains a single observation of a pulsar or of a particular area of sky. The file names of archived data indicate the date and time of the observation. Folded pulsar archives have the file extension ‘.rf’, calibration source files have the extension ‘.cf’, and observations obtained in search mode have ‘.sf’ [7-8].
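The extension convention above lends itself to a simple lookup when sorting downloaded files. The sketch below is a minimal illustration; the helper name `classify` and the sample file names are hypothetical and do not reflect the archive's actual naming scheme.

```python
import os

# Extension-to-type mapping as described in the text.
FILE_TYPES = {
    ".rf": "folded pulsar archive",
    ".cf": "calibration source",
    ".sf": "search-mode observation",
}


def classify(filename):
    """Return the observation type implied by the file extension."""
    extension = os.path.splitext(filename)[1].lower()
    return FILE_TYPES.get(extension, "unknown")


# Hypothetical file names, used only to exercise the lookup.
for name in ("example_observation.rf", "example_cal.cf", "example_search.sf", "notes.txt"):
    print(name, "->", classify(name))
```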

The digital filter-bank (DFB) system came into operation in January 2010, and the main data collection for the archive has been ongoing since then. Data have been recorded using an auto-correlation spectrometer, incoherent de-dispersion systems and the digital filter-bank (PDFB3). The PDFB3 system directly produces PSRFITS data, and we make no changes to the data files for inclusion in the archive. Individual data files range in size from 16 MB to 1 GB, and the total size of the data archive is at least 4 TB.

4 Obtaining the data

The pulsar query page can be reached in either of two ways:

(1) Point the web browser directly to http://data.xao.ac.cn/pul/pulsar/q/form.

(2) From the XAO Data Access Portal home page http://data.xao.ac.cn (Fig.1), click “XAO Pulsar Data Query” to open the data release page.

4.1 Interface features

The index page (Fig.1) lists published services available through web browsers. A “[P]” in the service listing means that the service is password protected, either because it is too rough for public consumption or because the data providers want exclusive access. In either case, you can contact the site operators to inquire about access.

Fig.2 shows the basic information of the XAO Pulsar Data Query service; it is displayed after you click the black triangle in front of the “[P]”.

More information can be seen by clicking the “[i]” link. Table 1 lists the fields of the pulsar table and their descriptions.

Fig.1 The main page of the XAO Data Archive Portal

Fig.2 Brief information of the XAO Pulsar Data Query service

4.2 Query Fields

Numeric expressions: these can be recognized by the little "[?num.expr.]" tag behind them. In addition to raw numbers, you can enter VizieR-like numeric expressions here.

String expressions: these have a little "[?char expr.]" tag and by default match using patterns, evaluating metacharacters like * or ? much as you may know from file name patterns. To force literal matching, prepend == to your string. Other operators available for string expressions include caseless matching, string comparisons, and negation.

Table 1 Field information for the pulsar table

Date expressions: these are marked with "[?date expr.]" tags. Dates must be given in the ISO format (YYYY-MM-DD). Among the most useful of the supported operators is the range; you can enter “2004-01-02 .. 2005-05-01” to specify the range of dates between Jan 2nd, 2004 and May 1st, 2005.

Selection boxes: these are either drop-down boxes, in which you can select only one entry, or open boxes, in which you can select more than one entry, usually (depending on the user interface) by control-clicking.

Others: application-specific input fields (e.g., cone searches) should come with a short explanation.
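To make the string and date expressions more concrete, the following sketch mimics their behaviour on the client side. It is only an approximation under stated assumptions: the service evaluates expressions on the server, the range is assumed here to include both end dates, Python's `fnmatchcase` additionally treats `[...]` as a metacharacter, and the function names and the sample pulsar name are chosen for illustration.

```python
from datetime import date
from fnmatch import fnmatchcase


def parse_date_range(expr):
    """Split 'YYYY-MM-DD .. YYYY-MM-DD' into a (start, end) pair of dates."""
    start_text, end_text = (part.strip() for part in expr.split(".."))
    return date.fromisoformat(start_text), date.fromisoformat(end_text)


def date_matches(day, expr):
    """True if day lies in the range; both end points are assumed inclusive."""
    start, end = parse_date_range(expr)
    return start <= day <= end


def string_matches(value, expr):
    """'==text' forces a literal comparison; otherwise * and ? act as wildcards."""
    if expr.startswith("=="):
        return value == expr[2:]
    return fnmatchcase(value, expr)


print(date_matches(date(2004, 6, 1), "2004-01-02 .. 2005-05-01"))
print(string_matches("J0534+2200", "J05*"))
print(string_matches("J0534+2200", "==J05*"))
```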

4.3 Query Modifiers

For most queries, you will have a "Table" query field near the bottom of the form. You can set sorting and limit options there. Note that depending on the query, selecting a column to sort may slow down the answer dramatically. This is because your query may match large amounts of data, and even if only 100 items are returned, potentially millions of them may have been sorted. On the other hand, results overflowing the match limit are not reproducible without a sort option, i.e., you may get a different set of, say, 100 items for identical query parameters at two different times. The services warn you about this fact when they return truncated results.

If the match limits offered by the form do not suit you, you can override the match limit by editing the result link (see below): substitute your value into _DBOPTIONS_LIMIT=100. The system has a hard-coded match limit that you cannot override in this way, but it is unlikely to affect you.
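Editing the result link by hand is straightforward, but the substitution can also be scripted. The sketch below rewrites the `_DBOPTIONS_LIMIT` parameter of a link using only the Python standard library; the helper name `with_match_limit` and the example link, including its `other=value` parameter, are placeholders rather than a real result link from the service.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_match_limit(result_url, limit):
    """Return result_url with _DBOPTIONS_LIMIT set to the requested limit."""
    parts = urlsplit(result_url)
    # Keep every parameter except an existing limit, then append the new one.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "_DBOPTIONS_LIMIT"
    ]
    query.append(("_DBOPTIONS_LIMIT", str(limit)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder link: only the limit parameter is taken from the text.
example = "http://data.xao.ac.cn/pul/pulsar/q/form?_DBOPTIONS_LIMIT=100&other=value"
print(with_match_limit(example, 500))
```

As noted above, a value beyond the hard-coded server limit will not take effect.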

You can usually select an output format. Options here include:

HTML (http://www.w3.org/MarkUp/): data is returned in your web browser. You can select additional columns for your output from an input field that pops down when you mouse over it.

VOTable (http://www.ivoa.net/documents/latest/VOT.html): data is returned in IVOA's standard data format, the VOTable. This is XML that can be human-readable but is really intended to be consumed by tools like TOPCAT. You can select a "verbosity" specifying the fields present in the output, with 1 standing for a minimal set of information, 2 for what the service author deemed useful for the average astronomer, 3 for (almost) all fields available, and finally H for (essentially) what the default HTML table gives. Furthermore, you can choose between a "human-readable" VOTable (select this to process your data with standard XML or text-based tools or to peruse it with the naked eye) and a binary version, which you should use for larger data sets since it is much more efficient.

FITS (http://fits.gsfc.nasa.gov/): this returns FITS tables. The data is in the first extension. This contains much less meta information than a VOTable of the same data and thus should only be used if your backend tools do not understand VOTables.

TSV: tab-separated values. If you are in a desperate pinch, you can get the table contents as an ASCII file. The fields are separated by tabs. All metadata is lost. Null values are (almost) always rendered as the string "None". Strings containing non-ASCII or control characters are rendered with C escapes (\n, \t, etc.) or hexadecimal Unicode code points (\xe4, for example, is an ä). Don't use this unless you absolutely have to.

JSON (http://www.json.org/js.html): JavaScript Object Notation. It is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition, December 1999. JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others. These properties make JSON an ideal data-interchange language.

CSV: comma-separated values. This format carries almost no metadata either, but it is understood by many database programs, spreadsheets, etc. (Note: don't use Excel to process astronomical data. TOPCAT is much nicer, and it has built-in VO support.) Null values are mostly rendered as empty fields, but float NULLs are NaNs.

Tar: when the matching items are files (usually FITS files of some kind), you can download all of them in a single tar file.
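Because the TSV output carries no metadata and writes null values as the string "None", a script that consumes it has to supply column names and restore the nulls itself. The following sketch shows one way to do this with the standard `csv` module; the column names and the two sample rows are invented for the example and are not the archive's actual fields.

```python
import csv
import io


def read_tsv(text, column_names):
    """Parse TSV text into dictionaries, mapping the string 'None' to None."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    rows = []
    for fields in reader:
        cleaned = [None if field == "None" else field for field in fields]
        rows.append(dict(zip(column_names, cleaned)))
    return rows


# Invented sample: target name, frequency in MHz, bandwidth in MHz.
sample = "J0534+2200\t1540.0\tNone\nJ0835-4510\t1540.0\t320.0\n"
rows = read_tsv(sample, ["target", "freq_mhz", "bandwidth_mhz"])
for row in rows:
    print(row)
```

All values remain strings after parsing, so numeric columns still need an explicit conversion; this is one reason the VOTable output is usually the better choice.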

4.4 Examples

This section gives three step-by-step procedures for querying the data. The first performs a Cone Search, the second shows how to combine multiple constraints in a query, and the third sends the query results to Virtual Observatory tools for data visualization.

(1) Cone Search

Cone Search (http://www.ivoa.net/Documents/PR/DAL/ConeSearch-20070628.html) is an IVOA protocol that defines a simple query for retrieving records from a catalog of astronomical sources. The query specifies a sky position and an angular distance, defining a cone on the sky. The response is a list of the astronomical sources whose positions lie within the cone. A Cone Search takes four steps:

Step 1. Open the XAO Pulsar Data Query website in a web browser.

Step 2. Enter the search criteria in the search form (shown in Fig.3). A position and a search radius are needed; the position can be given as RA and DEC coordinates (as h m s, d m s or decimal degrees) or as a SIMBAD-resolvable object name.

Step 3. Select an output format and any other parameters, then press the “GO” button.

Step 4. Check the results.
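The geometric test behind a cone search can be sketched in a few lines. The service performs this selection on the server; the code below only illustrates the idea by computing the great-circle separation between a source and the cone centre and comparing it with the search radius. The function names and the sample coordinates are assumptions for the example.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine form); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def in_cone(ra, dec, centre_ra, centre_dec, radius_deg):
    """True if the position (ra, dec) lies within radius_deg of the cone centre."""
    return angular_separation_deg(ra, dec, centre_ra, centre_dec) <= radius_deg


# Sample positions in decimal degrees, chosen only for illustration.
print(in_cone(83.63, 22.01, 83.60, 22.00, 0.1))
print(in_cone(90.0, 0.0, 0.0, 0.0, 10.0))
```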

(2) Multiple constraints query

A multiple-constraint query combines several retrieval conditions in one request. When more than one condition is filled in, all of them are enforced together. In the example shown in Fig.4 there are four retrieval conditions: the first is the target name, the second the observation date, the third the observation frequency and the last the bandwidth. Each constraint is evaluated, and only the records that satisfy every constraint are returned (a logical AND of the conditions).
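The way several conditions narrow a result set can be sketched as a filter over a small table. The example assumes, in line with the statement that all constraints are enforced, that a record must pass every condition to be returned. The records, field names and the `select` helper are invented for illustration and do not mirror the service's implementation.

```python
from datetime import date

# Hypothetical records; field names and values are invented for the example.
records = [
    {"target": "J0534+2200", "obs_date": date(2013, 3, 1), "freq_mhz": 1540.0, "bw_mhz": 320.0},
    {"target": "J0534+2200", "obs_date": date(2011, 7, 9), "freq_mhz": 1540.0, "bw_mhz": 320.0},
    {"target": "J0835-4510", "obs_date": date(2013, 5, 2), "freq_mhz": 1540.0, "bw_mhz": 320.0},
]


def select(rows, **conditions):
    """Keep the rows for which every per-field predicate returns True."""
    return [
        row for row in rows
        if all(test(row[field]) for field, test in conditions.items())
    ]


hits = select(
    records,
    target=lambda name: name == "J0534+2200",
    obs_date=lambda day: day.year == 2013,
    freq_mhz=lambda freq: freq == 1540.0,
    bw_mhz=lambda bw: bw == 320.0,
)
print(len(hits), "matching record(s)")
```

Each added condition can only shrink the result, which is why a query with all four fields filled in returns fewer records than a query on the target name alone.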

Fig.3 Schematic diagram of Cone Search query

Fig.4 Schematic diagram of multiple constraints query

(3) Data Visualization

As shown in Fig.5, using “~*2013*” as the search condition in the Observation Date field returns 4694 observational data records from 2013. The output format is HTML, and pulse profile previews appear when you mouse over the Product key field. Clicking the Preview link opens the large pulse profile image, which is generated by the “pav -DFTp” command from the PSRCHIVE (http://psrchive.sourceforge.net/) software package. Fig.6 shows the large pulse profile.

Fig.5 Schematic diagram of pulse profile

Fig.7 shows the results of an analysis with TOPCAT (http://topcat.switchinc.org/). TOPCAT is a Virtual Observatory tool from the AstroGrid software package (http://wwww.astrogrid.org).

Click the “send via SAMP” button (in the upper left corner of Fig.5; for SAMP see http://astropy.readthedocs.org/en/latest/vo/samp/) to send the query data to TOPCAT. TOPCAT must already be running before you use the “send via SAMP” function. TOPCAT's plotting functionality becomes available once the query results have been sent successfully. Fig.7 shows the pulsar positions on the 3D galactic coordinate sphere using the Spherical Plot.

Fig.6 Large pulse profile preview by PSRCHIVE

Fig.7 Results visualization by TOPCAT

5 Conclusions

The XAO Pulsar Data Archive currently provides access to 32,290 data files. We have implemented cone search and multiple-constraint queries and provide HTML, VOTable, CSV, JSON and tar package output formats; pulse profile previews are available in the HTML output. The results of this data management research will be applied to the large-diameter radio telescope in Xinjiang in the future.

Acknowledgement: The work described in this paper used the Taurus High Performance Computing Cluster of Xinjiang Astronomical Observatory, CAS, during testing.

[1]Wang Na. Pulsar astronomy in China[J]. Chinese Journal of Astronomy and Astrophysics, 2006, 6(2): 1-3.

[2]Hampson G, Brown A. A 1GHz Pulsar Digital Filter Bank and RFI Mitigation system[M/OL]. 2008[2016-01-22]. http://www.jb.man.ac.uk/pulsar/observing/DFB.pdf.

[3]Keith M J. Installation and use of pulsar search software[J]. Astronomical Research & Technology——Publications of National Astronomical Observatories of China, 2012, 9(3): 219-228.

[4]Demleitner M, Neves M C, Rothmaier F, et al. Virtual observatory publishing with DaCHS[J]. Astronomy and Computing, 2014, 7-8: 27-36.

[5]Demleitner M, Gufler B, Kim J, et al. The German Astrophysical Virtual Observatory (GAVO): archives and applications, status and services[J]. Astronomische Nachrichten, 2007, 328(7): 713.

[6]Hotan A W, van Straten W, Manchester R N. PSRCHIVE and PSRFITS: an open approach to radio pulsar data storage and analysis[J]. Publications of the Astronomical Society of Australia, 2004, 21(3): 302-309.

[7]Hobbs G, Miller D, Manchester R N, et al. The Parkes observatory pulsar data archive[J]. Publications of the Astronomical Society of Australia, 2011, 28(3): 202-214.

[8]Khoo J, Hobbs G, Manchester R N, et al. Using the Parkes pulsar data archive[J]. Astronomical Research & Technology——Publications of National Astronomical Observatories of China, 2012, 9(3): 229-238.

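Supported by the National Basic Research Program of China (973 Program) (2015CB857100); the National Natural Science Foundation of China (U1531125, 11503075); the Youth Innovation Promotion Association of the Chinese Academy of Sciences; the West Light Foundation (XBBS201325); the Astronomy Science and Technology Domain Cloud project (XXH12503-05-05); and the Chinese Academy of Sciences special fund for equipment renewal at astronomical observatories and stations and for the operation of major instruments.

Received: 2016-02-22; revised: 2016-03-14

Zhang Hailong, male, Ph.D. Research interest: data-intensive research. Email: zhanghailong@xao.ac.cn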

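Pulsar Data Retrieval at Xinjiang Astronomical Observatory

Zhang Hailong 1,2, Markus Demleitner 3, Wang Na 1,2, Yuan Jianping 1,2, Nie Jun 1,2, Wang Jie 1

(1. Xinjiang Astronomical Observatory, Chinese Academy of Sciences, Urumqi 830011, Xinjiang, China; 2. Key Laboratory of Radio Astronomy, Chinese Academy of Sciences, Nanjing 210008, Jiangsu, China; 3. Zentrum für Astronomie, Heidelberg University, 69120 Heidelberg, Germany)

Xinjiang Astronomical Observatory has so far archived 32,290 pulsar observation data files. The pulsar data retrieval platform provides a search service for observations of nearly 300 pulsars obtained with the 25 m radio telescope at Nanshan station since 2000. The data files and the retrieval and access methods comply with Virtual Observatory standards and protocols. This paper describes how to obtain data through the XAO pulsar data retrieval platform, how to carry out simple processing of the data with Virtual Observatory tools, and how to use the cone search and multiple-constraint search methods.

Pulsar data; Data center; Virtual observatory; Data query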

CLC number: P161

Document code: A

Article ID: 1672-7673(2016)04-0473-08

CN 53-1189/P, ISSN 1672-7673
