Basic Concepts

Understanding the core concepts of SMolSAT will help you make the most of its capabilities for molecular dynamics analysis.

🧬 Core Data Structures

Coordinate

The fundamental 3D coordinate class with periodic boundary condition support.

import smolsat

# Create coordinates
pos1 = smolsat.Coordinate(1.0, 2.0, 3.0)
pos2 = smolsat.Coordinate(4.0, 5.0, 6.0)

# Vector operations
distance = pos1.distance_to(pos2)
dot_product = pos1.dot(pos2)
magnitude = pos1.magnitude()

Key Features:

- 3D vector operations (dot product, cross product, magnitude)
- Periodic boundary condition calculations
- Efficient Eigen3 backend for performance
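
The vector operations listed above reduce to a few lines of arithmetic. The following is a minimal pure-Python sketch of that arithmetic, not SMolSAT's implementation; the helpers `dot`, `cross`, and `magnitude` are hypothetical names that operate on plain tuples rather than `Coordinate` objects.

```python
import math

# Hypothetical helpers on plain (x, y, z) tuples; not part of the SMolSAT API.
def dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def magnitude(a):
    """Euclidean length of a 3D vector."""
    return math.sqrt(dot(a, a))

pos1 = (1.0, 2.0, 3.0)
pos2 = (4.0, 5.0, 6.0)
print(dot(pos1, pos2))    # 32.0
print(cross(pos1, pos2))  # (-3.0, 6.0, -3.0)
print(magnitude(pos1))    # square root of 14
```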

Particle

Represents individual atoms or particles with time-dependent properties.

# Particles are typically created through trajectories
trajectory = smolsat.Trajectory()
particle = trajectory.add_particle(id=0, type=1, mass=1.0, type_name="H")

# Add position data for each frame
for frame in range(100):
    pos = smolsat.Coordinate(frame * 0.1, 0.0, 0.0)  # Moving particle
    particle.add_position(pos)

Properties:

- Unique ID and type classification
- Mass and type name for analysis
- Time series of positions, velocities, and unwrapped positions

Molecule

Groups of particles that form molecular entities.

# Create a water molecule (3 particles: H-O-H)
water_particle_ids = [0, 1, 2]  # H, O, H
water_molecule = trajectory.add_molecule(water_particle_ids)

# Analyze molecular properties
center_of_mass = water_molecule.center_of_mass(frame=0)
gyration_radius = water_molecule.gyration_radius(frame=0)

Applications:

- Polymer chain analysis
- Molecular size and shape characterization
- Multi-particle correlation studies
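
To make the center-of-mass calculation concrete, here is a minimal sketch of the standard mass-weighted average. It is an illustration only: `center_of_mass` below is a hypothetical standalone function, not the `Molecule` method, and the water-like coordinates are made up for the example.

```python
def center_of_mass(positions, masses):
    """Mass-weighted average position of a set of particles."""
    total_mass = sum(masses)
    return tuple(
        sum(m * p[axis] for m, p in zip(masses, positions)) / total_mass
        for axis in range(3)
    )

# Illustrative H, O, H geometry; the coordinates are invented for this example.
positions = [(-0.8, 0.6, 0.0), (0.0, 0.0, 0.0), (0.8, 0.6, 0.0)]
masses = [1.0, 16.0, 1.0]

com = center_of_mass(positions, masses)
print(com)  # lies close to the heavy oxygen at the origin
```

For a molecule that straddles a periodic boundary, a calculation like this only makes sense on unwrapped positions; otherwise the atoms appear on opposite sides of the box.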

Trajectory

The main container for all simulation data over time.

trajectory = smolsat.Trajectory()

# Add particles
for i in range(100):
    trajectory.add_particle(i, type=1, mass=1.0, type_name="A")

# Add time information
for frame in range(1000):
    trajectory.add_time(frame * 0.001)  # 0.001 time units per frame (1 fs if times are in ps)
    trajectory.add_box_size(smolsat.Coordinate(10.0, 10.0, 10.0))

Contains:

- All particle data and trajectories
- Time information for each frame
- Simulation box dimensions
- Molecular definitions

System

High-level interface for analysis with periodic boundary condition support.

# Create system from trajectory
system = smolsat.System(trajectory, periodic_boundaries=True)

# System provides analysis-ready interface
particles = [trajectory.particle(i) for i in range(50)]
distances = system.calculate_distances(particles, frame=0)

Features:

- Periodic boundary condition handling
- Efficient distance and displacement calculations
- Analysis method integration

🔬 Analysis Methods

Mean Square Displacement (MSD)

Measures particle mobility and diffusion properties.

# Quick MSD analysis
lag_times, msd_values = smolsat.quick_msd(trajectory)

# Advanced MSD with specific particles
system = smolsat.System(trajectory, periodic_boundaries=True)
selected_particles = trajectory.particles_of_type(1)[:50]
msd_analysis = smolsat.MeanSquareDisplacement(system, selected_particles)
msd_analysis.compute()

Applications:

- Diffusion coefficient calculation
- Mobility characterization
- Transport property analysis
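
For readers who want to see what an MSD calculation involves, the sketch below averages squared displacements over all particles and all time origins for each lag. It is a plain-Python illustration under that assumption; `mean_square_displacement` is a hypothetical function, and the averaging scheme SMolSAT actually uses is not specified here.

```python
def mean_square_displacement(positions, max_lag):
    """Compute MSD for lags 0..max_lag (in frames).

    positions[frame][particle] is an (x, y, z) tuple of unwrapped coordinates.
    Each lag is averaged over all particles and all available time origins.
    """
    n_frames = len(positions)
    n_particles = len(positions[0])
    msd = [0.0]  # zero displacement at lag 0
    for lag in range(1, max_lag + 1):
        total = 0.0
        count = 0
        for t0 in range(n_frames - lag):
            for p in range(n_particles):
                a = positions[t0][p]
                b = positions[t0 + lag][p]
                total += sum((b[k] - a[k]) ** 2 for k in range(3))
                count += 1
        msd.append(total / count)
    return msd

# One particle moving 0.1 length units per frame along x (ballistic motion)
positions = [[(0.1 * frame, 0.0, 0.0)] for frame in range(100)]
msd = mean_square_displacement(positions, max_lag=10)
print(msd[1], msd[10])  # grows as (0.1 * lag) ** 2
```

In this ballistic example the MSD grows quadratically with lag. In the diffusive regime it grows linearly instead, and the 3D diffusion coefficient D follows from the Einstein relation MSD(t) = 6 D t.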

Radius of Gyration (Rg)

Characterizes molecular size and compactness.

# System-wide radius of gyration
times, rg_values = smolsat.quick_rg(trajectory)

# Molecular radius of gyration
molecules = [trajectory.molecule(i) for i in range(trajectory.num_molecules())]
rg_analysis = smolsat.RadiusOfGyration(system, molecules)

Applications:

- Polymer conformation analysis
- Molecular size evolution
- Structural compactness studies
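
As a reference point, the radius of gyration is commonly defined as the mass-weighted root-mean-square distance of the particles from their center of mass. The sketch below implements that definition in plain Python; `radius_of_gyration` is a hypothetical function, and whether SMolSAT applies mass weighting is an assumption not confirmed here.

```python
import math

def radius_of_gyration(positions, masses):
    """Mass-weighted RMS distance of particles from their center of mass."""
    total_mass = sum(masses)
    com = [
        sum(m * p[k] for m, p in zip(masses, positions)) / total_mass
        for k in range(3)
    ]
    weighted_sq = sum(
        m * sum((p[k] - com[k]) ** 2 for k in range(3))
        for m, p in zip(masses, positions)
    )
    return math.sqrt(weighted_sq / total_mass)

# Two equal masses 2.0 apart: each sits 1.0 from the center of mass
rg = radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0])
print(rg)  # 1.0
```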

📁 Data Loading

File Format Support

SMolSAT supports various molecular dynamics file formats:

# XYZ format (most common)
trajectory = smolsat.load_xyz("simulation.xyz")

# Using loader directly
loader = smolsat.XYZLoader()
trajectory = loader.load("trajectory_file.xyz")
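
For orientation, the common XYZ convention stores each frame as an atom count, a free-form comment line, and then one line per atom with a symbol and three coordinates. The sketch below parses that layout from a string. It is an illustration only: `parse_xyz` is a hypothetical helper, and the exact variant that SMolSAT's loader expects is not specified here.

```python
def parse_xyz(text):
    """Parse multi-frame XYZ text into a list of frames.

    Each frame is a list of (symbol, x, y, z) tuples.
    """
    lines = text.strip().splitlines()
    frames = []
    i = 0
    while i < len(lines):
        n_atoms = int(lines[i].strip())
        # lines[i + 1] is the comment line and is skipped
        frame = []
        for line in lines[i + 2 : i + 2 + n_atoms]:
            symbol, x, y, z = line.split()[:4]
            frame.append((symbol, float(x), float(y), float(z)))
        frames.append(frame)
        i += 2 + n_atoms
    return frames

example = """2
frame 0
H 0.0 0.0 0.0
O 1.0 0.0 0.0
2
frame 1
H 0.1 0.0 0.0
O 1.1 0.0 0.0
"""

frames = parse_xyz(example)
print(len(frames), frames[1][0])
```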

Data Conversion

Convert between SMolSAT and NumPy formats:

# SMolSAT to NumPy
data = smolsat.trajectory_to_numpy(trajectory)
positions = data['positions']  # Shape: (frames, particles, 3)
times = data['times']

# NumPy to SMolSAT (box_sizes is assumed to be defined by the caller, e.g. one box per frame)
new_trajectory = smolsat.numpy_to_trajectory(
    positions=positions,
    times=times,
    box_sizes=box_sizes
)

🔄 Periodic Boundary Conditions

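Periodic boundary conditions (PBC) are essential for molecular dynamics simulations with finite simulation boxes: a particle that leaves through one face of the box re-enters through the opposite face, so distances must be measured to the nearest periodic image.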

Concepts

# PBC-aware distance calculation
coord1 = smolsat.Coordinate(0.1, 0.1, 0.1)
coord2 = smolsat.Coordinate(9.9, 9.9, 9.9)
box_size = smolsat.Coordinate(10.0, 10.0, 10.0)

# Regular distance (wrong for PBC)
regular_distance = coord1.distance_to(coord2)  # ~17.0

# PBC distance (correct)
pbc_distance = coord1.distance_to_pbc(coord2, box_size)  # ~0.35
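
The PBC-aware distance can be understood through the minimum image convention: each component of the separation is shifted by a whole number of box lengths so that it falls within half a box length of zero. The sketch below assumes an orthorhombic box; `minimum_image_distance` is a hypothetical function for illustration, not the implementation behind `distance_to_pbc`.

```python
import math

def minimum_image_distance(a, b, box):
    """Distance between a and b using the nearest periodic image (orthorhombic box)."""
    total = 0.0
    for k in range(3):
        d = b[k] - a[k]
        d -= box[k] * round(d / box[k])  # shift into [-box/2, box/2]
        total += d * d
    return math.sqrt(total)

coord1 = (0.1, 0.1, 0.1)
coord2 = (9.9, 9.9, 9.9)
box = (10.0, 10.0, 10.0)

d = minimum_image_distance(coord1, coord2, box)
print(d)  # about 0.346, i.e. 0.2 * sqrt(3)
```

Each component separation of 9.8 becomes -0.2 after the shift, which is why the two points are neighbors across the box corner rather than nearly a full diagonal apart.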

Coordinate Wrapping

# Wrap coordinates into primary cell
unwrapped = smolsat.Coordinate(12.5, -3.2, 15.1)
wrapped = unwrapped.wrap_pbc(box_size)  # (2.5, 6.8, 5.1)
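
Wrapping amounts to a per-component modulo by the box length. The sketch below assumes an orthorhombic box whose primary cell spans 0 to L along each axis; `wrap` is a hypothetical helper, not the `wrap_pbc` implementation.

```python
def wrap(coord, box):
    """Map each component into [0, box length) using the modulo operation."""
    # Python's % returns a non-negative result for a positive divisor,
    # so negative coordinates are handled without a special case.
    return tuple(c % b for c, b in zip(coord, box))

wrapped = wrap((12.5, -3.2, 15.1), (10.0, 10.0, 10.0))
print(wrapped)  # approximately (2.5, 6.8, 5.1), up to floating-point rounding
```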

🎯 Analysis Workflows

Typical Analysis Pipeline

# 1. Load or create trajectory
trajectory = smolsat.load_trajectory("simulation.xyz")

# 2. Create system with PBC
system = smolsat.System(trajectory, periodic_boundaries=True)

# 3. Select particles/molecules of interest
polymer_particles = trajectory.particles_of_type_name("POL")
solvent_particles = trajectory.particles_of_type_name("SOL")

# 4. Perform analyses
msd_polymer = smolsat.MeanSquareDisplacement(system, polymer_particles)
msd_solvent = smolsat.MeanSquareDisplacement(system, solvent_particles)

# 5. Compute and extract results
msd_polymer.compute()
msd_solvent.compute()

polymer_msd = msd_polymer.msd_values()
solvent_msd = msd_solvent.msd_values()

Batch Analysis

# Analyze multiple properties simultaneously
results = smolsat.analyze_trajectory(
    trajectory,
    analyses=['msd', 'rg', 'diffusion'],
    output_dir="analysis_results"
)

🚀 Performance Considerations

Memory Management

# For large trajectories, use particle selection
total_particles = trajectory.num_particles()
step = max(1, -(-total_particles // 1000))  # Ceiling division keeps the selection at or below 1000 particles
selected = [trajectory.particle(i) for i in range(0, total_particles, step)]

Computational Efficiency

# Use appropriate lag time ranges for MSD
max_lag = min(trajectory.num_frames() // 4, 1000)  # Don't exceed 25% of trajectory
lag_times, msd = smolsat.quick_msd(trajectory, max_lag_time=max_lag)
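
One reason to cap the lag range is statistical: if the MSD is averaged over all time origins, a lag of n frames has only (number of frames - n) origins to average over, so long lags are poorly sampled. The sketch below counts the samples under that assumption; `samples_per_lag` is a hypothetical helper, and SMolSAT's actual averaging scheme is not specified here.

```python
def samples_per_lag(n_frames, n_particles, lag):
    """Number of displacement samples available at a given lag (in frames)."""
    return max(0, n_frames - lag) * n_particles

n_frames = 1000
n_particles = 50
max_lag = min(n_frames // 4, 1000)  # same cap as above: 250 frames

for lag in (1, max_lag, n_frames - 1):
    print(lag, samples_per_lag(n_frames, n_particles, lag))
```

With 1000 frames and 50 particles, the longest possible lag has 50 samples while the capped lag of 250 frames still has 37500. Overlapping time origins are correlated, so the effective number of independent samples is smaller than these counts.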

🛠️ Integration with Scientific Python

NumPy Integration

import numpy as np

# Seamless conversion
data = smolsat.trajectory_to_numpy(trajectory)
positions = data['positions']

# Use NumPy for custom calculations
# Unweighted mean over particles: equals the center of mass only when all masses are equal
center_of_mass = np.mean(positions, axis=1)
distances_from_com = np.linalg.norm(
    positions - center_of_mass[:, np.newaxis, :], 
    axis=2
)

Matplotlib Integration

import matplotlib.pyplot as plt

# Built-in plotting functions
smolsat.plot_msd(lag_times, msd_values)
smolsat.plot_time_series(times, rg_values, ylabel="Rg")

# Custom plotting with matplotlib
plt.figure(figsize=(10, 6))
plt.loglog(lag_times, msd_values, 'b-', linewidth=2)
plt.xlabel('Lag Time')
plt.ylabel('MSD')
plt.show()

🎯 Best Practices

Data Organization

  1. Consistent Units: Ensure all data uses consistent units (e.g., Angstroms, picoseconds)
  2. Proper Typing: Use meaningful particle type names for clarity
  3. Molecular Definitions: Define molecules appropriately for your analysis

Analysis Strategy

  1. Start Simple: Use quick_* functions for initial exploration
  2. Validate Results: Check analysis parameters and results for physical reasonableness
  3. Particle Selection: Focus on relevant particles for computational efficiency
  4. Statistical Significance: Ensure sufficient sampling for reliable statistics

Performance Optimization

  1. Memory Usage: Monitor memory consumption for large trajectories
  2. Parallel Processing: SMolSAT uses efficient C++ backends for performance
  3. Data Preprocessing: Clean and validate data before analysis

Understanding these concepts will help you effectively use SMolSAT for your molecular dynamics analysis needs! 🎓