adfread — ADF TAPE21.asc Parser

Overview

This module provides the AdfParser class for reading ADF (Amsterdam Density Functional) TAPE21.asc files — the ASCII export format of the ADF binary checkpoint file TAPE21.

The parser converts the structured plain-text representation into a nested Python dictionary keyed by group and variable name, and can write a human-readable dump of the parsed data.

The module can also be run directly as a command-line utility:

$ python adfread.py results.asc

This produces results.txt containing a formatted dump of all data.

Module-level constants

adfread.INT_FIELD_WIDTH

int — Fixed column width (in characters) used for integer fields: 12.

adfread.FLOAT_FIELD_WIDTH

int — Fixed column width (in characters) used for floating-point fields: 28.

adfread.STRING_BLOCK_SIZE

int — Block size (in characters) for string entries: 160.

AdfParser class

class adfread.AdfParser(filename)

Parser for ADF TAPE21.asc files.

Parameters:

filename (str or pathlib.Path) – Path to the .asc file to parse.

Instance attributes

filename: pathlib.Path

Resolved path to the input file.

lines: list of str

Raw lines of the file as loaded by load(). Empty until load() or parse() is called.

data: dict[str, dict[str, Any]]

Nested dictionary of parsed values, populated by parse(). The outer key is the group name and the inner key is the variable name. Each value is either a numpy.ndarray (for integers and floats) or a list (for strings and booleans), depending on the type code in the file.

Static methods

static _float_x(x)

Convert an ADF-formatted floating-point string to a Python float.

Handles ADF-specific quirks such as bare exponents starting with 'E' (missing mantissa) and negative-exponent notation.

Parameters:

x (str) – Raw field string from the file.

Returns:

Parsed float value.

Return type:

float
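
As a rough illustration of the kind of normalisation described above, the sketch below repairs two forms before calling float(). It is not the module's implementation. The name parse_adf_float is hypothetical, reading "negative-exponent notation" as a dropped 'E' (for example 1.234-100, which Fortran-style output can produce for three-digit exponents) is an assumption, and so is treating a missing mantissa as 1.

```python
import re

# Matches a signed exponent at the end of the string that directly follows
# a digit or a decimal point, i.e. an exponent whose 'E' has been dropped.
_DROPPED_E = re.compile(r"(?<=[\d.])([+-]\d+)$")


def parse_adf_float(field: str) -> float:
    """Hypothetical helper: convert a raw float field to a Python float."""
    text = field.strip()
    # Assumption: a field that starts with the exponent marker has an
    # implied mantissa of 1.
    if text[:1] in ("E", "e"):
        text = "1" + text
    # Assumption: '1.234-100' is meant as 1.234E-100.
    text = _DROPPED_E.sub(r"E\1", text)
    return float(text)


print(parse_adf_float("  1.500000E+02"))
print(parse_adf_float("1.234-100"))
```

The lookbehind keeps ordinary negative numbers such as -2.5 untouched, because their sign is not preceded by a digit or a decimal point.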

static _split_n(s, n)

Split a string into fixed-length chunks of exactly n characters, stripping surrounding whitespace from each chunk.

Parameters:
  • s (str) – Input string (typically one raw file line).

  • n (int) – Chunk width in characters.

Returns:

List of stripped substrings.

Return type:

list of str
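
A minimal sketch of fixed-width splitting, assuming plain string slicing is sufficient. The name split_fixed is hypothetical. How the module treats a final chunk shorter than n is not stated here; the sketch simply keeps it.

```python
INT_FIELD_WIDTH = 12  # value documented under "Module-level constants"


def split_fixed(line: str, width: int) -> list[str]:
    """Cut `line` into consecutive `width`-character chunks and strip each."""
    return [line[i:i + width].strip() for i in range(0, len(line), width)]


# Build a row of three right-justified integer fields, as a data line
# in the file might look (made-up values).
row = f"{7:>12}{-3:>12}{42:>12}"
print(split_fixed(row, INT_FIELD_WIDTH))
```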

static _int_x(s)

Convert an ADF-formatted integer string to a Python int.

ADF uses the sentinel '**********' to represent integer overflow (values that do not fit in a signed 32-bit integer). Such values are mapped to -(2**31).

Parameters:

s (str) – Raw field string from the file.

Returns:

Parsed integer value, or -(2**31) if the field holds the overflow marker.

Return type:

int
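
The overflow rule can be sketched as follows. The name parse_adf_int and the two constants are hypothetical. The sketch assumes the marker is compared after stripping surrounding whitespace.

```python
OVERFLOW_MARKER = "*" * 10    # the documented sentinel '**********'
OVERFLOW_VALUE = -(2 ** 31)   # documented replacement value


def parse_adf_int(field: str) -> int:
    """Hypothetical helper: convert a raw integer field to a Python int."""
    text = field.strip()
    if text == OVERFLOW_MARKER:
        return OVERFLOW_VALUE
    return int(text)


print(parse_adf_int("          42"))
print(parse_adf_int("  **********"))
```

Callers that need to tell a real value from an overflowed one can compare against -(2**31) after parsing.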

Methods

load()

Read the file into lines.

Uses latin-1 encoding to faithfully handle any byte values that may appear in ADF output.

Raises:

SystemExit – If the file does not exist or is a directory; an error message is printed and the process exits with code 1.
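
The behaviour described for load() might look roughly like the sketch below. The function name read_lines and the wording of the error message are assumptions, as is stripping the trailing newline from each line. Latin-1 maps every byte value 0 to 255 to a code point, so decoding cannot fail on unexpected bytes.

```python
import sys
from pathlib import Path


def read_lines(path):
    """Hypothetical helper: read a text file as latin-1, or exit with code 1."""
    path = Path(path)
    if not path.is_file():  # False for missing paths and for directories
        print(f"Error: cannot read {path}")
        sys.exit(1)
    with path.open(encoding="latin-1") as handle:
        # Iterating the handle splits on newlines only.
        return [line.rstrip("\n") for line in handle]
```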

parse()

Parse all data from lines into data.

Calls load() automatically if lines is empty.

File structure understood by the parser:

Each variable is represented by three consecutive header records followed by a data block:

  1. Group name — a section identifier string (e.g. 'Geometry', 'SCF').

  2. Key name — the variable name within the group.

  3. Descriptor line — three whitespace-separated integers: len1, len2, typ.

    • len2 — number of values to read.

    • typ — type code:

      Code  Type     Storage
      ----  -------  --------------------------
      1     Integer  numpy.ndarray, dtype int
      2     Float    numpy.ndarray, dtype float
      3     String   list of str
      4     Boolean  list of bool

    • len1 — when 0, one additional blank line follows the data block and is consumed.

  4. Data block — one or more lines holding len2 values encoded in fixed-width fields (INT_FIELD_WIDTH, FLOAT_FIELD_WIDTH, or STRING_BLOCK_SIZE).

Returns:

The populated data dictionary.

Return type:

dict[str, dict[str, Any]]

Raises:
  • ValueError – If an unknown type code (not 1–4) is encountered.

  • Exception – Re-raises any parsing exception after printing a context window of ±3 lines around the failing position.
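
The record layout described for parse() can be sketched with a small in-memory example. This is a simplified illustration under stated assumptions, not the module's code. The names read_fixed and parse_records are hypothetical, only type codes 1 and 2 are handled, and the sample records (including the key names and the descriptor values) are made up.

```python
import numpy as np

INT_FIELD_WIDTH = 12    # documented width of an integer field
FLOAT_FIELD_WIDTH = 28  # documented width of a float field


def read_fixed(lines, start, count, width, convert):
    """Collect `count` fixed-width values, consuming as many lines as needed."""
    values = []
    i = start
    while len(values) < count:
        line = lines[i].rstrip()
        values.extend(convert(line[j:j + width]) for j in range(0, len(line), width))
        i += 1
    return np.array(values[:count]), i


def parse_records(lines):
    """Hypothetical mini-parser: group, key, descriptor, then data block."""
    data = {}
    i = 0
    while i < len(lines):
        group = lines[i].strip()
        key = lines[i + 1].strip()
        len1, len2, typ = (int(tok) for tok in lines[i + 2].split())
        i += 3
        if typ == 1:
            values, i = read_fixed(lines, i, len2, INT_FIELD_WIDTH, int)
        elif typ == 2:
            values, i = read_fixed(lines, i, len2, FLOAT_FIELD_WIDTH, float)
        else:
            raise ValueError(f"Unknown type code: {typ}")
        if len1 == 0:
            i += 1  # documented rule: one blank line follows the data block
        data.setdefault(group, {})[key] = values
    return data


# Made-up sample: one integer variable and one float variable.
sample = [
    "Geometry",
    "nr of atoms",
    "1 1 1",
    f"{3:>12}",
    "Geometry",
    "xyz",
    "3 3 2",
    "".join(f"{v:>28.18E}" for v in (0.0, 0.5, 1.0)),
]
data = parse_records(sample)
print(data["Geometry"]["xyz"])
```

Returning the next line index from each reader, as the private helpers in this module are documented to do, keeps the main loop free of any knowledge about how many lines a data block spans.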

write_dump(outfile)

Write a human-readable text dump of data to outfile.

Calls parse() automatically if data is empty.

The output format is:

<group name>
  <key> = <value>
  <long_key> = {<count>}
      <value line 1>
      <value line 2>
      ...

Multi-line values (those whose string representation contains a newline) are printed in the indented block form shown above.

Parameters:

outfile (str or pathlib.Path) – Destination file path.
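
A minimal sketch of the dump layout, assuming that <count> is the number of elements in the value and that the indentation is two and six spaces as shown above. The name format_dump is hypothetical, and the function returns a string instead of writing a file so that it is easy to inspect.

```python
import numpy as np


def format_dump(data):
    """Hypothetical helper: render nested parsed data in the dump layout."""
    out = []
    for group, variables in data.items():
        out.append(group)
        for key, value in variables.items():
            text = str(value)
            if "\n" in text:
                # Multi-line values go into an indented block.
                out.append(f"  {key} = {{{len(value)}}}")
                out.extend("      " + line for line in text.splitlines())
            else:
                out.append(f"  {key} = {text}")
    return "\n".join(out) + "\n"


# Made-up data: a short list and an array long enough for numpy to wrap.
example = {"SCF": {"iters": [1, 2], "big": np.arange(100)}}
print(format_dump(example))
```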

Private parsing helpers

The following methods are used internally by parse() and share the same calling convention:

_parse_integers(start, count)

Read count integers beginning at line start of lines, consuming as many lines as needed (using INT_FIELD_WIDTH column width).

Parameters:
  • start (int) – Starting line index.

  • count (int) – Number of values to read.

Returns:

(values, next_line_index)

Return type:

tuple(numpy.ndarray, int)

_parse_floats(start, count)

Read count floats beginning at line start, using FLOAT_FIELD_WIDTH column width.

Parameters:
  • start (int) – Starting line index.

  • count (int) – Number of values to read.

Returns:

(values, next_line_index)

Return type:

tuple(numpy.ndarray, int)

_parse_strings(start, count)

Read count characters of raw text starting at line start, then split the text into blocks of STRING_BLOCK_SIZE characters, stripping trailing whitespace from each block.

Parameters:
  • start (int) – Starting line index.

  • count (int) – Number of characters to consume.

Returns:

(values, next_line_index)

Return type:

tuple(list of str, int)
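
The block splitting can be sketched as below. The name read_string_blocks is hypothetical. The sketch assumes that lines are concatenated without separators and that they keep their trailing padding, so that the requested number of characters is actually present.

```python
STRING_BLOCK_SIZE = 160  # value documented under "Module-level constants"


def read_string_blocks(lines, start, count):
    """Hypothetical helper: gather `count` characters, then cut into blocks."""
    text = ""
    i = start
    while len(text) < count:
        text += lines[i]
        i += 1
    text = text[:count]
    blocks = [
        text[j:j + STRING_BLOCK_SIZE].rstrip()
        for j in range(0, count, STRING_BLOCK_SIZE)
    ]
    return blocks, i


# Two padded 160-character entries on one line (made-up content).
lines = ["water".ljust(160) + "B3LYP".ljust(160)]
print(read_string_blocks(lines, 0, 320))
```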

_parse_bools(start, count)

Read count boolean values beginning at line start. Each character on a line is decoded as True for 'T' or False for 'F'.

Parameters:
  • start (int) – Starting line index.

  • count (int) – Number of values to read.

Returns:

(values, next_line_index)

Return type:

tuple(list of bool, int)
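
A short sketch of the character-wise decoding, assuming surrounding whitespace on each line is ignored and that any character other than 'T' counts as False. The name read_bools is hypothetical.

```python
def read_bools(lines, start, count):
    """Hypothetical helper: decode 'T'/'F' characters into booleans."""
    values = []
    i = start
    while len(values) < count:
        values.extend(ch == "T" for ch in lines[i].strip())
        i += 1
    return values[:count], i


print(read_bools(["TFT", "FF"], 0, 5))
```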

_parse_value(typ, start, count)

Dispatch to the appropriate _parse_* helper based on typ.

Parameters:
  • typ (int) – Type code (1 = int, 2 = float, 3 = str, 4 = bool).

  • start (int) – Starting line index.

  • count (int) – Number of values to read.

Returns:

(values, next_line_index)

Return type:

tuple(Any, int)

Raises:

ValueError – If typ is not in {1, 2, 3, 4}.
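
One common way to write such a dispatcher is a lookup table from type code to handler. The sketch below shows the pattern with stand-in handlers. Whether the module uses a table or an if/elif chain is not stated here, and the names dispatch_value and HANDLERS are hypothetical.

```python
def dispatch_value(typ, handlers, start, count):
    """Hypothetical dispatcher: look up the handler for `typ` and call it."""
    try:
        handler = handlers[typ]
    except KeyError:
        raise ValueError(f"Unknown type code: {typ}") from None
    return handler(start, count)


# Stand-in handlers that only report what they would read.
HANDLERS = {
    1: lambda start, count: (f"{count} ints", start + 1),
    2: lambda start, count: (f"{count} floats", start + 1),
    3: lambda start, count: (f"{count} chars", start + 1),
    4: lambda start, count: (f"{count} bools", start + 1),
}

print(dispatch_value(2, HANDLERS, 10, 3))
```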

Usage examples

Parsing a file programmatically

from adfread import AdfParser

parser = AdfParser('TAPE21.asc')
data = parser.parse()

# Access a specific group and variable
atom_coords = data['Geometry']['xyz']   # numpy.ndarray of floats
atom_types  = data['Geometry']['atomtype']

print(atom_coords.reshape(-1, 3))

Writing a human-readable dump

from adfread import AdfParser

parser = AdfParser('TAPE21.asc')
parser.write_dump('TAPE21.txt')

Iterating over all groups and keys

from adfread import AdfParser

parser = AdfParser('TAPE21.asc')
data = parser.parse()

for group, variables in data.items():
    for key, value in variables.items():
        print(f'{group} / {key}: shape={getattr(value, "shape", len(value))}')

Command-line usage

$ python adfread.py TAPE21.asc
# Produces TAPE21.txt

# Error case: wrong extension
$ python adfread.py results.dat
Error: input file must end with .asc

Command-line interface

When invoked as a script the module accepts a single positional argument:

python adfread.py <file.asc>

Condition                         Behaviour
--------------------------------  -----------------------------------------------------------------
No argument supplied              Prints usage and exits with code 1.
File extension is not .asc        Prints an error and exits with code 1.
File not found or is a directory  Prints an error and exits with code 1.
Success                           Writes <stem>.txt alongside the input file and exits with code 0.
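
The argument checks listed above can be sketched as a small function that returns the exit code and the output path. The name plan_cli and the exact usage and file-not-found messages are assumptions, and the extension check is assumed to be case-sensitive. Only the wrong-extension message is taken from the usage examples in this document.

```python
from pathlib import Path


def plan_cli(argv):
    """Hypothetical helper: return (exit_code, output_path) for `argv`."""
    if len(argv) != 2:
        print("Usage: python adfread.py <file.asc>")
        return 1, None
    infile = Path(argv[1])
    if infile.suffix != ".asc":
        print("Error: input file must end with .asc")
        return 1, None
    if not infile.is_file():  # missing path or a directory
        print(f"Error: {infile} not found or is a directory")
        return 1, None
    # <stem>.txt next to the input file
    return 0, infile.with_suffix(".txt")


print(plan_cli(["adfread.py", "results.dat"]))
```

A real entry point would then call AdfParser(infile).write_dump(output_path) and pass the exit code to sys.exit().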

Dependencies

Package  Usage
-------  ------------------------------------------------------------------------------
numpy    Storage for parsed integer and float arrays.
re       Regular-expression preprocessing of ADF float strings in AdfParser._float_x().
pathlib  Platform-independent path handling.
sys      Reading command-line arguments and controlled process exit.
typing   Type annotations (Any, Dict, List, Union).