jittor_geometric.data

Data structures and utilities.

Author: lusz Date: 2025-01-14 00:47:42 Description:

class jittor_geometric.data.Data(x=None, edge_index=None, edge_attr=None, y=None, pos=None, normal=None, face=None, column_indices=None, row_offset=None, csr_edge_weight=None, row_indices=None, column_offset=None, csc_edge_weight=None, **kwargs)[source]

Bases: object

A plain old Python object modeling a single graph with various (optional) attributes:

Parameters:
  • x (Var, optional) – Node feature matrix with shape [num_nodes, num_node_features]. (default: None)

  • edge_index (Var.int32, optional) – Graph connectivity in COO format with shape [2, num_edges]. (default: None)

  • edge_attr (Var, optional) – Edge feature matrix with shape [num_edges, num_edge_features]. (default: None)

  • y (Var, optional) – Graph or node targets with arbitrary shape. (default: None)

  • pos (Var, optional) – Node position matrix with shape [num_nodes, num_dimensions]. (default: None)

  • normal (Var, optional) – Normal vector matrix with shape [num_nodes, num_dimensions]. (default: None)

  • face (Var.int32, optional) – Face adjacency matrix with shape [3, num_faces]. (default: None)

The data object is not restricted to these attributes and can be extended with any additional data.

Example:

data = Data(x=x, edge_index=edge_index)
data.train_idx = jt.array([...], dtype=Var.int32)
data.test_mask = jt.array([...], dtype=Var.bool)
__init__(x=None, edge_index=None, edge_attr=None, y=None, pos=None, normal=None, face=None, column_indices=None, row_offset=None, csr_edge_weight=None, row_indices=None, column_offset=None, csc_edge_weight=None, **kwargs)[source]
classmethod from_dict(dictionary)[source]

Creates a data object from a python dictionary.

to_dict()[source]
to_namedtuple()[source]
__getitem__(key)[source]

Gets the data of the attribute key.

__setitem__(key, value)[source]

Sets the attribute key to value.

property keys

Returns all names of graph attributes.

__len__()[source]

Returns the number of all present attributes.

__contains__(key)[source]

Returns True, if the attribute key is present in the data.

__iter__()[source]

Iterates over all present attributes in the data, yielding their attribute names and content.

__call__(*keys)[source]

Iterates over all attributes *keys in the data, yielding their attribute names and content. If *keys is not given, this method will iterate over all present attributes.
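The iteration protocol described for __iter__ and __call__ can be sketched in plain Python. MiniData below is a hypothetical stand-in, not the library's class. It assumes that attributes set to None count as absent and that names are yielded in sorted order; neither assumption is confirmed by this reference.

```python
class MiniData:
    """Hypothetical stand-in for Data: stores attributes, skips None values."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)

    @property
    def keys(self):
        # Names of all attributes that are actually present (not None).
        return [k for k, v in self.__dict__.items() if v is not None]

    def __iter__(self):
        # Yield (name, content) pairs for every present attribute.
        for key in sorted(self.keys):
            yield key, getattr(self, key)

    def __call__(self, *keys):
        # Restrict iteration to the requested keys; fall back to all keys.
        present = self.keys
        wanted = keys if keys else sorted(present)
        for key in wanted:
            if key in present:
                yield key, getattr(self, key)


data = MiniData(x=[[1.0], [2.0]], edge_index=[[0, 1], [1, 0]], y=None)
all_items = list(data)          # every present attribute
only_x = list(data("x", "y"))   # "y" is None, so only "x" is yielded
```

Skipping absent attributes keeps loops over a data object free of None checks.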

__cat_dim__(key, value)[source]

Returns the dimension for which value of attribute key will get concatenated when creating batches.

Note

This method is for internal use only, and should only be overridden if the batch concatenation process is corrupted for a specific data attribute.

__inc__(key, value)[source]

Returns the incremental count to cumulatively increase the value of the next attribute of key when creating batches.

Note

This method is for internal use only, and should only be overridden if the batch concatenation process is corrupted for a specific data attribute.

property num_nodes

Returns or sets the number of nodes in the graph.

Note

The number of nodes in your data object is typically inferred automatically, e.g., when node features x are present. In some cases, however, a graph may be given only by its edge indices edge_index. Jittor Geometric then guesses the number of nodes as edge_index.max().item() + 1, but if the graph contains isolated nodes, this guess may be wrong and can cause unexpected batch-wise behavior. We therefore recommend setting the number of nodes explicitly via data.num_nodes = .... You will be given a warning that requests you to do so.
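The following plain-Python sketch shows why the guess can be wrong. infer_num_nodes is a hypothetical helper that mirrors the edge_index.max() + 1 rule on nested lists; it does not call Jittor Geometric.

```python
def infer_num_nodes(edge_index):
    """Guess the node count as the largest node id in edge_index plus one."""
    sources, targets = edge_index
    return max(max(sources), max(targets)) + 1


# Five nodes (0..4), but node 4 is isolated and never appears in an edge.
edge_index = [[0, 1, 2, 3], [1, 2, 3, 0]]
true_num_nodes = 5

guessed = infer_num_nodes(edge_index)   # the isolated node 4 is missed
undercounted = guessed < true_num_nodes
```

Setting num_nodes explicitly removes the dependence on which node ids happen to occur in edge_index.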

property num_edges

Returns the number of edges in the graph. For undirected graphs, this will return the number of bi-directional edges, which is double the amount of unique edges.

property num_faces

Returns the number of faces in the mesh.

property num_node_features

Returns the number of features per node in the graph.

property num_features

Alias for num_node_features.

property num_edge_features

Returns the number of features per edge in the graph.

contains_isolated_nodes()[source]

Returns True, if the graph contains isolated nodes.

contains_self_loops()[source]

Returns True, if the graph contains self-loops.
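Both checks can be illustrated on a COO edge list in plain Python. The helpers below are hypothetical sketches, not the library's implementation. They assume that a node whose only edge is a self-loop still counts as isolated.

```python
def has_self_loops(edge_index):
    """True if any edge starts and ends at the same node."""
    sources, targets = edge_index
    return any(s == t for s, t in zip(sources, targets))


def has_isolated_nodes(edge_index, num_nodes):
    """True if some node is not an endpoint of any non-loop edge."""
    sources, targets = edge_index
    touched = set()
    for s, t in zip(sources, targets):
        if s != t:  # assumption: self-loops do not connect a node to the graph
            touched.update((s, t))
    return len(touched) < num_nodes


# Nodes 0 and 1 are connected; node 2 only has a self-loop.
loopy = [[0, 1, 2], [1, 0, 2]]
loops_found = has_self_loops(loopy)
isolated_found = has_isolated_nodes(loopy, num_nodes=3)
```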

apply(func, *keys)[source]

Applies the function func to all Var attributes *keys. If *keys is not given, func is applied to all present attributes.

contiguous(*keys)[source]

Ensures a contiguous memory layout for all attributes *keys. If *keys is not given, all present attributes are ensured to have a contiguous memory layout.

cpu(*keys)[source]

Copies all attributes *keys to CPU memory. If *keys is not given, the conversion is applied to all present attributes.

cuda(device=None, non_blocking=False, *keys)[source]

Copies all attributes *keys to CUDA memory. If *keys is not given, the conversion is applied to all present attributes.

clone()[source]

Performs a deep-copy of the data object.

pin_memory(*keys)[source]

Copies all attributes *keys to pinned memory. If *keys is not given, the conversion is applied to all present attributes.

debug()[source]
class jittor_geometric.data.Dataset(root=None, transform=None, pre_transform=None, pre_filter=None)[source]

Bases: Dataset

Parameters:
  • root (string, optional) – Root directory where the dataset should be saved. (default: None)

  • transform (callable, optional) – A function/transform that takes in a jittor_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)

  • pre_transform (callable, optional) – A function/transform that takes in a jittor_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)

  • pre_filter (callable, optional) – A function that takes in a jittor_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
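The difference between the three callables is when they run. The sketch below uses hypothetical helper names and plain integers in place of data objects. It assumes pre_filter is applied before pre_transform, which is a common convention rather than something this reference states.

```python
def build_processed(raw_items, pre_filter=None, pre_transform=None):
    """Runs once, before the result would be saved to disk."""
    items = list(raw_items)
    if pre_filter is not None:
        # Drop items for which the filter returns False.
        items = [item for item in items if pre_filter(item)]
    if pre_transform is not None:
        # Transform the surviving items a single time.
        items = [pre_transform(item) for item in items]
    return items


def access(items, idx, transform=None):
    """Runs on every access; the stored item itself is left unchanged."""
    item = items[idx]
    return item if transform is None else transform(item)


processed = build_processed(
    [1, 2, 3, 4],
    pre_filter=lambda v: v % 2 == 0,
    pre_transform=lambda v: v * 10,
)
accessed = access(processed, 0, transform=lambda v: v + 1)
```

Expensive, deterministic steps fit pre_transform, while randomized augmentation fits transform because it is re-applied on each access.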

property raw_file_names

The name of the files to find in the self.raw_dir folder in order to skip the download.

property processed_file_names

The name of the files to find in the self.processed_dir folder in order to skip the processing.

download()[source]

Downloads the dataset to the self.raw_dir folder.

process()[source]

Processes the dataset to the self.processed_dir folder.
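The skip logic behind raw_file_names, processed_file_names, download() and process() can be sketched with the standard library. SketchDataset is a hypothetical class, not the library's Dataset. The raw and processed subfolder names under root and the file names are assumptions made for illustration.

```python
import os
import tempfile


def files_exist(paths):
    # True only if there is at least one path and all of them exist.
    return len(paths) > 0 and all(os.path.exists(p) for p in paths)


class SketchDataset:
    raw_file_names = ["graph.txt"]        # hypothetical file name
    processed_file_names = ["data.bin"]   # hypothetical file name

    def __init__(self, root):
        self.raw_dir = os.path.join(root, "raw")              # assumed layout
        self.processed_dir = os.path.join(root, "processed")  # assumed layout
        self.calls = []  # records which steps actually ran
        raw_paths = [os.path.join(self.raw_dir, n) for n in self.raw_file_names]
        processed_paths = [
            os.path.join(self.processed_dir, n) for n in self.processed_file_names
        ]
        if not files_exist(raw_paths):
            os.makedirs(self.raw_dir, exist_ok=True)
            self.download()
        if not files_exist(processed_paths):
            os.makedirs(self.processed_dir, exist_ok=True)
            self.process()

    def download(self):
        self.calls.append("download")
        with open(os.path.join(self.raw_dir, "graph.txt"), "w") as f:
            f.write("0 1\n")

    def process(self):
        self.calls.append("process")
        with open(os.path.join(self.processed_dir, "data.bin"), "wb") as f:
            f.write(b"placeholder")


with tempfile.TemporaryDirectory() as root:
    first_run = SketchDataset(root).calls    # nothing on disk yet
    second_run = SketchDataset(root).calls   # files found, both steps skipped
```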

len()[source]
get(idx)[source]

Gets the data object at index idx.

__init__(root=None, transform=None, pre_transform=None, pre_filter=None)[source]
indices()[source]
property raw_dir
property processed_dir
property num_node_features

Returns the number of features per node in the dataset.

property num_features

Alias for num_node_features.

property num_edge_features

Returns the number of features per edge in the dataset.

property raw_paths

The filepaths to find in order to skip the download.

property processed_paths

The filepaths to find in the self.processed_dir folder in order to skip the processing.

__len__()[source]

The number of examples in the dataset.

__getitem__(idx)[source]

Gets the data object at index idx and transforms it (if self.transform is given). If idx is a slice (e.g., [2:5]), a list, a tuple, a Var of type int32, or a Var of type bool, a subset of the dataset at the specified indices is returned instead.
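The accepted index types all reduce to a list of integer positions. resolve_indices below is a hypothetical helper working on Python lists and tuples; Var inputs are not modeled, and a boolean sequence is treated as a mask.

```python
def resolve_indices(idx, length):
    """Turn a slice, an integer sequence, or a boolean mask into positions."""
    if isinstance(idx, slice):
        return list(range(length))[idx]
    idx = list(idx)
    # bool is a subclass of int, so the mask check has to come first.
    if idx and all(isinstance(i, bool) for i in idx):
        return [pos for pos, keep in enumerate(idx) if keep]
    return [int(i) for i in idx]


from_slice = resolve_indices(slice(2, 5), length=10)
from_mask = resolve_indices([True, False, True], length=3)
from_tuple = resolve_indices((4, 1), length=5)
```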

index_select(idx)[source]
shuffle(return_perm=False)[source]

Randomly shuffles the examples in the dataset.

Parameters:

return_perm (bool, optional) – If set to True, will additionally return the random permutation used to shuffle the dataset. (default: False)
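A minimal sketch of the return_perm behavior, using a plain list in place of a dataset. The function and its seed parameter are hypothetical and exist only to make the example reproducible.

```python
import random


def shuffle_examples(examples, return_perm=False, seed=None):
    """Shuffle a list and optionally return the permutation that was used."""
    rng = random.Random(seed)
    perm = list(range(len(examples)))
    rng.shuffle(perm)
    shuffled = [examples[i] for i in perm]
    return (shuffled, perm) if return_perm else shuffled


examples = ["g0", "g1", "g2", "g3"]
shuffled, perm = shuffle_examples(examples, return_perm=True, seed=0)
```

Keeping perm makes it possible to apply the same reordering to data stored outside the dataset, such as precomputed labels.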

class jittor_geometric.data.InMemoryDataset(root=None, transform=None, pre_transform=None, pre_filter=None)[source]

Bases: Dataset

Dataset base class for creating graph datasets which fit completely into CPU memory.

Parameters:
  • root (string, optional) – Root directory where the dataset should be saved. (default: None)

  • transform (callable, optional) – A function/transform that takes in a jittor_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)

  • pre_transform (callable, optional) – A function/transform that takes in a jittor_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)

  • pre_filter (callable, optional) – A function that takes in a jittor_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)

property raw_file_names

The name of the files to find in the self.raw_dir folder in order to skip the download.

property processed_file_names

The name of the files to find in the self.processed_dir folder in order to skip the processing.

download()[source]

Downloads the dataset to the self.raw_dir folder.

process()[source]

Processes the dataset to the self.processed_dir folder.

__init__(root=None, transform=None, pre_transform=None, pre_filter=None)[source]
property num_classes

The number of classes in the dataset.

len()[source]
get(idx)[source]

Gets the data object at index idx.

static collate(data_list)[source]

Collates a Python list of data objects to the internal storage format of jittor_geometric.data.InMemoryDataset.
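The internal storage format is not spelled out in this reference. The sketch below assumes it mirrors the layout known from PyTorch Geometric: one concatenated store per attribute plus a table of slice offsets. All names are hypothetical, and dictionaries of lists stand in for data objects.

```python
def collate_sketch(data_list):
    """Concatenate per-graph attributes and record where each graph starts."""
    keys = list(data_list[0].keys())
    storage = {key: [] for key in keys}
    slices = {key: [0] for key in keys}
    for data in data_list:
        for key in keys:
            storage[key].extend(data[key])
            slices[key].append(slices[key][-1] + len(data[key]))
    return storage, slices


def get_sketch(storage, slices, idx):
    """Recover graph idx by slicing every attribute with its offsets."""
    return {
        key: storage[key][slices[key][idx]:slices[key][idx + 1]]
        for key in storage
    }


graphs = [{"x": [1.0, 2.0], "y": [0]}, {"x": [3.0, 4.0, 5.0], "y": [1]}]
storage, slices = collate_sketch(graphs)
second_graph = get_sketch(storage, slices, 1)
```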

copy(idx=None)[source]
jittor_geometric.data.download_url(url, folder, log=True)[source]

Downloads the content of a URL to a specific folder.

Parameters:
  • url (string) – The url.

  • folder (string) – The folder.

  • log (bool, optional) – If False, will not print anything to the console. (default: True)

jittor_geometric.data.decide_download(url)[source]
jittor_geometric.data.extract_zip(path, folder, log=True)[source]

Extracts a zip archive to a specific folder.

Parameters:
  • path (string) – The path to the zip archive.

  • folder (string) – The folder.

  • log (bool, optional) – If False, will not print anything to the console. (default: True)

jittor_geometric.data.extract_gz(path, folder, log=True)[source]

Extracts a gz archive to a specific folder.

Parameters:
  • path (str) – The path to the gz archive.

  • folder (str) – The folder.

  • log (bool, optional) – If False, will not print anything to the console. (default: True)

Return type:

None

jittor_geometric.data.extract_tar(path, folder, mode='r:gz', log=True)[source]

Extracts a tar archive to a specific folder.

Parameters:
  • path (str) – The path to the tar archive.

  • folder (str) – The folder.

  • mode (str, optional) – The compression mode. (default: "r:gz")

  • log (bool, optional) – If False, will not print anything to the console. (default: True)

Return type:

None
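For orientation, the behavior described for extract_tar is roughly what the standard library's tarfile module provides. extract_tar_sketch is a hypothetical stand-in with the same parameters and is not the library's code.

```python
import os
import tarfile
import tempfile


def extract_tar_sketch(path, folder, mode="r:gz", log=True):
    """Extract the archive at path into folder, optionally logging."""
    if log:
        print("Extracting", path)
    os.makedirs(folder, exist_ok=True)
    with tarfile.open(path, mode) as archive:
        archive.extractall(folder)


with tempfile.TemporaryDirectory() as tmp:
    # Build a small gzip-compressed tar archive to extract.
    source = os.path.join(tmp, "hello.txt")
    with open(source, "w") as f:
        f.write("hello")
    archive_path = os.path.join(tmp, "demo.tar.gz")
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source, arcname="hello.txt")

    out_dir = os.path.join(tmp, "out")
    extract_tar_sketch(archive_path, out_dir, log=False)
    with open(os.path.join(out_dir, "hello.txt")) as f:
        extracted_text = f.read()
```

The mode string follows tarfile.open, so "r:gz" reads a gzip-compressed archive and "r" reads an uncompressed one.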

class jittor_geometric.data.CSC(row_indices=None, column_offset=None, edge_weight=None)[source]

Bases: object

__init__(row_indices=None, column_offset=None, edge_weight=None)[source]
class jittor_geometric.data.CSR(column_indices=None, row_offset=None, edge_weight=None)[source]

Bases: object

__init__(column_indices=None, row_offset=None, edge_weight=None)[source]
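The fields column_indices and row_offset correspond to the standard compressed sparse row layout. The sketch below builds them from a COO edge list in plain Python. coo_to_csr is a hypothetical helper, and taking rows to be source nodes is an assumption. CSC is the same construction with the roles of rows and columns swapped.

```python
def coo_to_csr(edge_index, num_nodes):
    """Convert a COO edge list into (column_indices, row_offset)."""
    sources, targets = edge_index
    # Sort edges by row (source), then by column (target).
    order = sorted(range(len(sources)), key=lambda e: (sources[e], targets[e]))
    column_indices = [targets[e] for e in order]
    # Count the edges per row, then turn the counts into a prefix sum.
    row_offset = [0] * (num_nodes + 1)
    for s in sources:
        row_offset[s + 1] += 1
    for i in range(num_nodes):
        row_offset[i + 1] += row_offset[i]
    return column_indices, row_offset


edge_index = [[0, 0, 1, 2], [1, 2, 2, 0]]
column_indices, row_offset = coo_to_csr(edge_index, num_nodes=3)

# Neighbors of node 0 are the columns between its two offsets.
neighbors_of_0 = column_indices[row_offset[0]:row_offset[1]]
```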
class jittor_geometric.data.GraphChunk(chunks, chunk_id, v_num, global_v_num, local_mask=None, local_feature=None, local_label=None)[source]

Bases: object

__init__(chunks, chunk_id, v_num, global_v_num, local_mask=None, local_feature=None, local_label=None)[source]
set_csr(column_indices, row_offset, edge_weight=None)[source]

Sets the CSR (Compressed Sparse Row) representation of the graph.

Parameters:
  • column_indices – Column indices of the non-zero elements.

  • row_offset – Row offsets for the CSR format.

  • edge_weight – Optional edge weights.

save(file_path)[source]

Saves the GraphChunk instance as a binary file.

Parameters:
  • file_path (str) – Path to the file where the instance will be saved.

static load(file_path)[source]

Loads a GraphChunk instance from a binary file.

Parameters:
  • file_path (str) – Path to the file from which the instance will be loaded.

Returns:

The loaded GraphChunk instance.

class jittor_geometric.data.TemporalData(src=None, dst=None, t=None, msg=None, y=None, edge_ids=None, **kwargs)[source]

Bases: object

__init__(src=None, dst=None, t=None, msg=None, y=None, edge_ids=None, **kwargs)[source]
__setitem__(key, value)[source]

Sets the attribute key to value.

property keys
property num_nodes
property max_node_id
property src_size
property dst_size
property num_events
property num_edges
apply(func, *keys)[source]

Applies the function func to all Var attributes *keys. If *keys is not given, func is applied to all present attributes.

to(device, *keys, **kwargs)[source]
train_val_test_split(val_ratio=0.15, test_ratio=0.15)[source]
train_val_test_split_w_mask()[source]
seq_batches(batch_size)[source]
class jittor_geometric.data.Batch(x=None, edge_index=None, edge_attr=None, y=None, pos=None, normal=None, face=None, column_indices=None, row_offset=None, csr_edge_weight=None, row_indices=None, column_offset=None, csc_edge_weight=None, **kwargs)[source]

Bases: Data

A data object describing a batch of graphs as one big (disconnected) graph. Inherits from jittor_geometric.data.Data.

classmethod from_data_list(data_list, follow_batch=None, exclude_keys=None)[source]

Constructs a Batch object from a list of Data objects. The assignment vector batch is created on the fly. In addition, creates assignment vectors for each key in follow_batch. Will exclude any keys given in exclude_keys.

Return type:

Batch
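Treating a batch as one big disconnected graph means node features are concatenated, each graph's edge indices are shifted by the number of nodes that came before it, and a batch vector records which graph each node belongs to. The sketch below uses hypothetical names and dictionaries of lists; it does not call Jittor Geometric.

```python
def batch_graphs(graphs):
    """Merge graphs into one disconnected graph with an assignment vector."""
    x, sources, targets, batch = [], [], [], []
    offset = 0  # number of nodes already placed in the batch
    for graph_id, graph in enumerate(graphs):
        num_nodes = len(graph["x"])
        x.extend(graph["x"])
        # Shift node ids so they stay unique inside the merged graph.
        sources.extend(s + offset for s in graph["edge_index"][0])
        targets.extend(t + offset for t in graph["edge_index"][1])
        batch.extend([graph_id] * num_nodes)
        offset += num_nodes
    return {"x": x, "edge_index": [sources, targets], "batch": batch}


g0 = {"x": [[1.0], [2.0]], "edge_index": [[0, 1], [1, 0]]}
g1 = {"x": [[3.0], [4.0], [5.0]], "edge_index": [[0, 1], [1, 2]]}
merged = batch_graphs([g0, g1])
```

The shift applied to edge_index is the kind of per-attribute increment that __inc__ describes, and the concatenation axis is what __cat_dim__ selects.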

get_example(idx)[source]

Gets the Data object at index idx. The Batch object must have been created via from_data_list() in order to be able to reconstruct the initial object.

Parameters:

idx (int)

Return type:

Data

to_data_list()[source]

Reconstructs the list of Data objects from the Batch object.

Return type:

List[Data]

property num_graphs: int

Returns the number of graphs in the batch.

property batch_size: int

Alias for num_graphs.

class jittor_geometric.data.Dictionary(*, bos='[CLS]', pad='[PAD]', eos='[SEP]', unk='[UNK]', extra_special_symbols=None)[source]

Bases: object

A mapping from symbols to consecutive integers

__init__(*, bos='[CLS]', pad='[PAD]', eos='[SEP]', unk='[UNK]', extra_special_symbols=None)[source]
__len__()[source]

Returns the number of symbols in the dictionary

vec_index(a)[source]
index(sym)[source]

Returns the index of the specified symbol

special_index()[source]
add_symbol(word, n=1, overwrite=False, is_special=False)[source]

Adds a word to the dictionary

bos()[source]

Helper to get index of beginning-of-sentence symbol

pad()[source]

Helper to get index of pad symbol

eos()[source]

Helper to get index of end-of-sentence symbol

unk()[source]

Helper to get index of unk symbol

classmethod load(f)[source]

Loads the dictionary from a text file with the format:

<symbol0> <count0>
<symbol1> <count1>
...
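A file in this format can be read line by line, assigning consecutive indices in order of appearance. parse_dictionary is a hypothetical helper. It ignores special symbols and assumes every line carries a count, which the library's loader may not require.

```python
import io


def parse_dictionary(f):
    """Read '<symbol> <count>' lines into an index map and a count list."""
    indices, counts = {}, []
    for line in f:
        line = line.strip()
        if not line:
            continue
        # Split on the last space so the count is always the final field.
        symbol, count = line.rsplit(" ", 1)
        if symbol in indices:
            raise ValueError("duplicate symbol: " + symbol)
        indices[symbol] = len(indices)
        counts.append(int(count))
    return indices, counts


text = io.StringIO("C 120\nN 45\nO 60\n")
indices, counts = parse_dictionary(text)
```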

add_from_file(f)[source]

Loads pre-defined dictionary symbols. If f == “default”, it will load the default atom dictionary. Otherwise, loads from a text file and adds its symbols to this instance.

class jittor_geometric.data.ConformerGen(**params)[source]

Bases: object

This class is designed to generate conformers for molecules represented as SMILES strings, using the provided parameters and configurations. The transform method uses multiprocessing to speed up conformer generation.

__init__(**params)[source]

Initializes the neural network model based on the provided model name and parameters.

Parameters:
  • model_name – (str) The name of the model to initialize.

  • params – Additional parameters for model configuration.

Returns:

An instance of the specified neural network model.

Raises:

ValueError – If the model name is not recognized.

transform_raw(atoms_list, coordinates_list)[source]

Transforms raw atomic data and coordinates into Uni-Mol input.

Parameters:
  • atoms_list – List of atomic symbols.

  • coordinates_list – List of atomic coordinates.

Returns:

List of Uni-Mol input.