adfread — ADF TAPE21.asc Parser

Overview

This module provides the AdfParser class for reading ADF (Amsterdam Density Functional) TAPE21.asc files — the ASCII export format of the ADF binary checkpoint file TAPE21.

The parser converts the structured plain-text representation into a nested Python dictionary keyed by group and variable name, and can write a human-readable dump of the parsed data.

The module can also be run directly as a command-line utility:

$ python adfread.py results.asc

This produces results.txt containing a formatted dump of all data.

Module-level constants

adfread.INT_FIELD_WIDTH: int — Fixed column width (in characters) used for integer fields: 12.

adfread.FLOAT_FIELD_WIDTH: int — Fixed column width (in characters) used for floating-point fields: 28.

adfread.STRING_BLOCK_SIZE: int — Block size (in characters) for string entries: 160.

AdfParser class

class adfread.AdfParser(filename)

Parser for ADF TAPE21.asc files.

Parameters: filename (str or pathlib.Path) – Path to the .asc file to parse.

Instance attributes

filename: pathlib.Path: Resolved path to the input file.

lines: list of str: Raw lines of the file as loaded by load(). Empty until load() or parse() is called.

data: dict[str, dict[str, Any]]: Nested dictionary of parsed values, populated by parse(). The outer key is the group name and the inner key is the variable name. Depending on the type code in the file, the value is either a numpy.ndarray (integers or floats) or a list (strings or booleans).

Static methods

static _float_x(x)

Convert an ADF-formatted floating-point string to a Python float.

Handles ADF-specific quirks such as bare exponents starting with 'E' (missing mantissa) and negative-exponent notation.

Parameters: x (str) – Raw field string from the file.
Returns: Parsed float value.
Return type: float
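
The exact rewriting rules applied by _float_x are not listed here. As a rough illustration only, the sketch below assumes two quirks of Fortran-style output: a field that starts with 'E' and has no mantissa (treated here as mantissa 1), and an exponent written without the 'E' (for example 1.234-100). The helper name to_float, the regular expression, and both rules are assumptions, not the module's actual code.

```python
import re

# Hypothetical stand-in for AdfParser._float_x; the rules below are assumptions.
# Matches a trailing signed exponent that directly follows a digit or a dot.
_MISSING_E = re.compile(r"(?<=[0-9.])([+-]\d+)$")


def to_float(field: str) -> float:
    """Convert a Fortran-style float field to a Python float."""
    text = field.strip()
    if text[:1] in ("E", "e"):
        # Assumed quirk 1: bare exponent such as "E-05"; use a mantissa of 1.
        text = "1" + text
    elif "e" not in text.lower():
        # Assumed quirk 2: exponent letter dropped, e.g. "1.234-100".
        text = _MISSING_E.sub(r"E\1", text)
    return float(text)


print(to_float("  1.5E+00"))
print(to_float("1.234-100"))
```

Ordinary values such as "-2.0" pass through unchanged, because the pattern only matches a sign that follows a digit or decimal point.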

static _split_n(s, n)

Split a string into fixed-length chunks of exactly n characters, stripping surrounding whitespace from each chunk.

Parameters:

s (str) – Input string (typically one raw file line).
n (int) – Chunk width in characters.

Returns:

List of stripped substrings.

Return type:

list of str
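
Fixed-width splitting of this kind can be sketched in a few lines. The function name split_fixed is hypothetical, and the sketch assumes that a final chunk shorter than n characters is kept rather than dropped.

```python
from typing import List


def split_fixed(s: str, n: int) -> List[str]:
    """Cut s into consecutive n-character chunks and strip each chunk."""
    return [s[i:i + n].strip() for i in range(0, len(s), n)]


# Build a line of three right-aligned 12-character integer fields.
line = "".join(f"{v:>12d}" for v in (1, 2, 30))
print(split_fixed(line, 12))  # ['1', '2', '30']
```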

static _int_x(s)

Convert an ADF-formatted integer string to a Python int.

ADF uses the sentinel '**********' to represent integer overflow (values that do not fit in a signed 32-bit integer). Such values are mapped to -(2**31).

Parameters: s (str) – Raw field string from the file.
Returns: Parsed integer value, or -(2**31) for the overflow marker.
Return type: int
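
The sentinel handling described above might look like the following minimal sketch. The names parse_int_field and INT32_OVERFLOW are hypothetical; only the sentinel string and the -(2**31) mapping come from the description.

```python
INT32_OVERFLOW = -(2**31)  # value the overflow sentinel is mapped to


def parse_int_field(s: str) -> int:
    """Convert one integer field, mapping the overflow sentinel to -(2**31)."""
    field = s.strip()
    if field == "**********":
        return INT32_OVERFLOW
    return int(field)


print(parse_int_field("          42"))
print(parse_int_field("  **********"))
```

Callers that need to detect overflowed entries can compare parsed values against -(2**31).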

Methods

load()

Read the file into lines.

Uses latin-1 encoding to faithfully handle any byte values that may appear in ADF output.

Raises: SystemExit – If the file does not exist or is a directory; an error message is printed and the process exits with code 1.

parse()

Parse all data from lines into data.

Calls load() automatically if lines is empty.

File structure understood by the parser:

Each variable is represented by four consecutive logical records:

Group name: a section identifier string (e.g. 'Geometry', 'SCF').
Key name: the variable name within the group.
Descriptor line: three whitespace-separated integers, len1, len2 and typ.
- len1: when 0, one additional blank line follows the data block and is consumed.
- len2: number of values to read (for strings, the number of characters).
- typ: type code, one of:

  Code	Type	Storage
  `1`	Integer	`numpy.ndarray`, dtype `int`
  `2`	Float	`numpy.ndarray`, dtype `float`
  `3`	String	`list` of `str`
  `4`	Boolean	`list` of `bool`
Data block: one or more lines holding len2 values encoded in fixed-width fields (INT_FIELD_WIDTH, FLOAT_FIELD_WIDTH, or STRING_BLOCK_SIZE).
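
To make the record layout concrete, the sketch below reads the group name, key name and descriptor of one record from a list of lines. The helper read_header is hypothetical, and the sample record is made up for illustration rather than taken from a real TAPE21.asc file.

```python
from typing import List, Tuple


def read_header(lines: List[str], i: int) -> Tuple[str, str, int, int, int, int]:
    """Read group, key and descriptor starting at line i.

    Returns (group, key, len1, len2, typ, index_of_first_data_line).
    """
    group = lines[i].strip()
    key = lines[i + 1].strip()
    # The descriptor holds three whitespace-separated integers.
    len1, len2, typ = (int(tok) for tok in lines[i + 2].split())
    return group, key, len1, len2, typ, i + 3


# Invented sample record: one integer value stored under Geometry / nr of atoms.
sample = [
    "Geometry",
    "nr of atoms",
    "         1         1         1",
    "           3",
]
group, key, len1, len2, typ, data_start = read_header(sample, 0)
print(group, key, len1, len2, typ, data_start)
```

With typ equal to 1 and len2 equal to 1, the data block starting at data_start would then be decoded as a single integer field.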

Returns:

The populated data dictionary.

Return type:

dict[str, dict[str, Any]]

Raises:

ValueError – If an unknown type code (not 1–4) is encountered.
Exception – Re-raises any parsing exception after printing a context window of ±3 lines around the failing position.

write_dump(outfile)

Write a human-readable text dump of data to outfile.

Calls parse() automatically if data is empty.

The output format is:

<group name>
  <key> = <value>
  <long_key> = {<count>}
      <value line 1>
      <value line 2>
      ...

Multi-line values (those whose string representation contains a newline) are printed in the indented block form shown above.
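
The choice between the inline form and the block form can be sketched as follows. The helper format_entry is hypothetical; it takes the already stringified value and assumes that <count> is the number of elements, which the caller passes in explicitly.

```python
from typing import List


def format_entry(key: str, text: str, count: int) -> List[str]:
    """Return the dump lines for one key, given str(value) and its element count."""
    if "\n" not in text:
        # Single-line values are written inline.
        return [f"  {key} = {text}"]
    # Multi-line values get a header with the count and an indented block.
    lines = [f"  {key} = {{{count}}}"]
    lines.extend(f"      {row}" for row in text.splitlines())
    return lines


print("\n".join(format_entry("nr of atoms", "3", 1)))
print("\n".join(format_entry("xyz", "[[0. 0. 0.]\n [1. 0. 0.]]", 6)))
```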

Parameters: outfile (str or pathlib.Path) – Destination file path.

Private parsing helpers

The following methods are used internally by parse() and share the same calling convention:

_parse_integers(start, count)

Read count integers beginning at line start of lines, consuming as many lines as needed (using INT_FIELD_WIDTH column width).

Parameters:

start (int) – Starting line index.
count (int) – Number of values to read.

Returns:

(values, next_line_index)

Return type:

tuple(numpy.ndarray, int)
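
The shared calling convention, (start, count) in and (values, next_line_index) out, can be illustrated with a standalone sketch. The function read_ints is hypothetical and takes the line list as an explicit argument. It returns a plain list to stay free of dependencies, whereas the method documented above returns a numpy.ndarray. Skipping empty fields and returning immediately for a count of 0 are assumptions.

```python
from typing import List, Tuple

INT_FIELD_WIDTH = 12  # documented width of one integer field


def read_ints(lines: List[str], start: int, count: int) -> Tuple[List[int], int]:
    """Read count integers from fixed-width fields, spanning lines as needed."""
    values: List[int] = []
    i = start
    while len(values) < count:
        line = lines[i].rstrip("\n")
        for pos in range(0, len(line), INT_FIELD_WIDTH):
            field = line[pos:pos + INT_FIELD_WIDTH].strip()
            if field:
                values.append(int(field))
        i += 1
    # i now points at the first line after the data block.
    return values[:count], i


rows = [
    "".join(f"{v:12d}" for v in (1, 2, 3)),
    "".join(f"{v:12d}" for v in (4, 5)),
    "next group",
]
print(read_ints(rows, 0, 5))  # ([1, 2, 3, 4, 5], 2)
```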

_parse_floats(start, count)

Read count floats beginning at line start, using FLOAT_FIELD_WIDTH column width.

Parameters:

start (int) – Starting line index.
count (int) – Number of values to read.

Returns:

(values, next_line_index)

Return type:

tuple(numpy.ndarray, int)

_parse_strings(start, count)

Read count characters of raw text starting at line start and split into STRING_BLOCK_SIZE-character blocks, stripping trailing whitespace from each block.

Parameters:

start (int) – Starting line index.
count (int) – Number of characters to consume.

Returns:

(values, next_line_index)

Return type:

tuple(list of str, int)

_parse_bools(start, count)

Read count boolean values beginning at line start. Each character on a line is decoded as 'T' → True or 'F' → False.

Parameters:

start (int) – Starting line index.
count (int) – Number of values to read.

Returns:

(values, next_line_index)

Return type:

tuple(list of bool, int)
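
Character-wise boolean decoding can be sketched as below. The function read_bools is hypothetical and takes the line list explicitly. Raising ValueError for any character other than 'T' or 'F' is an assumption, since the behaviour for unexpected characters is not described here.

```python
from typing import List, Tuple

_BOOL_CODES = {"T": True, "F": False}


def read_bools(lines: List[str], start: int, count: int) -> Tuple[List[bool], int]:
    """Decode count 'T'/'F' characters, spanning lines as needed."""
    values: List[bool] = []
    i = start
    while len(values) < count:
        for ch in lines[i].strip():
            if ch not in _BOOL_CODES:
                raise ValueError(f"unexpected boolean character {ch!r} on line {i}")
            values.append(_BOOL_CODES[ch])
        i += 1
    return values[:count], i


print(read_bools(["TFTT", "F"], 0, 5))  # ([True, False, True, True, False], 2)
```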

_parse_value(typ, start, count)

Dispatch to the appropriate _parse_* helper based on typ.

Parameters:

typ (int) – Type code (1 = int, 2 = float, 3 = str, 4 = bool).
start (int) – Starting line index.
count (int) – Number of values to read (for strings, the number of characters).

Returns:

(values, next_line_index)

Return type:

tuple(Any, int)

Raises:

ValueError – If typ is not in {1, 2, 3, 4}.
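
One common way to implement this kind of dispatch is a lookup table from type code to reader function. The sketch below is an assumption about a possible structure, not the module's actual code; dispatch, Reader and the stub readers are hypothetical names.

```python
from typing import Any, Callable, Dict, List, Tuple

# A reader takes (lines, start, count) and returns (values, next_line_index).
Reader = Callable[[List[str], int, int], Tuple[Any, int]]


def dispatch(readers: Dict[int, Reader], typ: int,
             lines: List[str], start: int, count: int) -> Tuple[Any, int]:
    """Call the reader registered for typ, or raise ValueError if there is none."""
    try:
        reader = readers[typ]
    except KeyError:
        raise ValueError(f"unknown type code: {typ}") from None
    return reader(lines, start, count)


# Stub reader used only to demonstrate the dispatch itself.
def read_one_int(lines: List[str], start: int, count: int) -> Tuple[Any, int]:
    return [int(lines[start])], start + 1


readers = {1: read_one_int}
print(dispatch(readers, 1, ["7"], 0, 1))  # ([7], 1)
```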

Usage examples

Parsing a file programmatically

from adfread import AdfParser

parser = AdfParser('TAPE21.asc')
data = parser.parse()

# Access a specific group and variable
atom_coords = data['Geometry']['xyz']   # numpy.ndarray of floats
atom_types  = data['Geometry']['atomtype']

print(atom_coords.reshape(-1, 3))

Writing a human-readable dump

from adfread import AdfParser

parser = AdfParser('TAPE21.asc')
parser.write_dump('TAPE21.txt')

Iterating over all groups and keys

from adfread import AdfParser

parser = AdfParser('TAPE21.asc')
data = parser.parse()

for group, variables in data.items():
    for key, value in variables.items():
        print(f'{group} / {key}: shape={getattr(value, "shape", len(value))}')

Command-line usage

$ python adfread.py TAPE21.asc
# Produces TAPE21.txt

# Error — wrong extension:
$ python adfread.py results.dat
Error: input file must end with .asc

Command-line interface

When invoked as a script the module accepts a single positional argument:

python adfread.py <file.asc>

Condition	Behaviour
No argument supplied	Prints usage and exits with code 1.
File extension is not `.asc`	Prints an error and exits with code 1.
File not found or is a directory	Prints an error and exits with code 1.
Success	Writes `<stem>.txt` alongside the input file and exits with code 0.
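
The argument checks in the table can be sketched as a small function that returns the exit code and the output path instead of exiting. The name plan_run is hypothetical. Apart from the wrong-extension message, the message wording is assumed, as are the use of stderr and the case-sensitive suffix comparison.

```python
import sys
from pathlib import Path
from typing import List, Optional, Tuple


def plan_run(argv: List[str]) -> Tuple[int, Optional[Path]]:
    """Validate argv and return (exit_code, output_path or None)."""
    if len(argv) < 2:
        print("Usage: python adfread.py <file.asc>", file=sys.stderr)
        return 1, None
    src = Path(argv[1])
    if src.suffix != ".asc":
        print("Error: input file must end with .asc", file=sys.stderr)
        return 1, None
    if not src.is_file():
        # Covers both a missing file and a directory.
        print(f"Error: {src} not found or is a directory", file=sys.stderr)
        return 1, None
    # On success the dump goes to <stem>.txt alongside the input file.
    return 0, src.with_suffix(".txt")


print(plan_run(["adfread.py", "results.dat"]))  # (1, None)
```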

Dependencies

Module or package	Usage
`numpy`	Storage for parsed integer and float arrays.
`re`	Regular-expression preprocessing of ADF float strings in `AdfParser._float_x()`.
`pathlib`	Platform-independent path handling.
`sys`	Reading command-line arguments and controlled process exit.
`typing`	Type annotations (`Any`, `Dict`, `List`, `Union`).
