Basic Concepts
Understanding the core concepts of SMolSAT will help you make the most of its capabilities for molecular dynamics analysis.
🧬 Core Data Structures
Coordinate
The fundamental 3D coordinate class with periodic boundary condition support.
import smolsat
# Create coordinates
pos1 = smolsat.Coordinate(1.0, 2.0, 3.0)
pos2 = smolsat.Coordinate(4.0, 5.0, 6.0)
# Vector operations
distance = pos1.distance_to(pos2)
dot_product = pos1.dot(pos2)
magnitude = pos1.magnitude()
Key Features:
- 3D vector operations (dot product, cross product, magnitude)
- Periodic boundary condition calculations
- Efficient Eigen3 backend for performance
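To make the vector operations concrete, here is a minimal plain-Python sketch of the underlying math. It does not use SMolSAT; the helper names `dot`, `cross`, `magnitude`, and `distance` are illustrative and operate on simple `(x, y, z)` tuples.

```python
import math

def dot(a, b):
    """Dot product of two 3D vectors given as (x, y, z) tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Cross product a x b, perpendicular to both inputs."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def magnitude(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))

def distance(a, b):
    """Straight-line distance between two points (no periodic boundaries)."""
    return magnitude(tuple(bi - ai for ai, bi in zip(a, b)))

pos1 = (1.0, 2.0, 3.0)
pos2 = (4.0, 5.0, 6.0)

print(dot(pos1, pos2))       # 32.0
print(cross(pos1, pos2))     # (-3.0, 6.0, -3.0)
print(distance(pos1, pos2))  # sqrt(27), about 5.196
```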
Particle
Represents individual atoms or particles with time-dependent properties.
# Particles are typically created through trajectories
trajectory = smolsat.Trajectory()
particle = trajectory.add_particle(id=0, type=1, mass=1.0, type_name="H")
# Add position data for each frame
for frame in range(100):
    pos = smolsat.Coordinate(frame * 0.1, 0.0, 0.0)  # Moving particle
    particle.add_position(pos)
Properties:
- Unique ID and type classification
- Mass and type name for analysis
- Time series of positions, velocities, and unwrapped positions
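The difference between wrapped and unwrapped positions can be illustrated with a short plain-Python sketch. It shows one common way to rebuild a continuous coordinate from wrapped values, and it is not necessarily how SMolSAT does it. The function name `unwrap_1d` is hypothetical, and the approach assumes the particle moves less than half a box length between consecutive frames.

```python
def unwrap_1d(wrapped, box_length):
    """Rebuild a continuous coordinate series from wrapped values.

    Assumes the true displacement between consecutive frames is smaller
    than half the box length, so the minimum-image step is the real step.
    """
    unwrapped = [wrapped[0]]
    for prev, curr in zip(wrapped, wrapped[1:]):
        step = curr - prev
        # Remove any whole-box jump caused by crossing a periodic boundary
        step -= box_length * round(step / box_length)
        unwrapped.append(unwrapped[-1] + step)
    return unwrapped

# A particle drifting in +x crosses the boundary of a box of length 10
x_wrapped = [9.0, 9.5, 0.0, 0.5, 1.0]
x_unwrapped = unwrap_1d(x_wrapped, 10.0)
print(x_unwrapped)  # [9.0, 9.5, 10.0, 10.5, 11.0]
```

Unwrapped coordinates are what displacement-based analyses such as MSD need, because wrapped coordinates jump by a full box length at every boundary crossing.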
Molecule
Groups of particles that form molecular entities.
# Create a water molecule (3 particles: H-O-H)
water_particle_ids = [0, 1, 2] # H, O, H
water_molecule = trajectory.add_molecule(water_particle_ids)
# Analyze molecular properties
center_of_mass = water_molecule.center_of_mass(frame=0)
gyration_radius = water_molecule.gyration_radius(frame=0)
Applications:
- Polymer chain analysis
- Molecular size and shape characterization
- Multi-particle correlation studies
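As a reference for what a molecular center of mass represents, the sketch below computes the mass-weighted mean position in plain Python. The function name `center_of_mass` and the water geometry are illustrative assumptions, not SMolSAT code or output.

```python
def center_of_mass(positions, masses):
    """Mass-weighted mean position of a set of particles."""
    total_mass = sum(masses)
    return tuple(
        sum(m * p[axis] for p, m in zip(positions, masses)) / total_mass
        for axis in range(3)
    )

# Approximate water geometry in Angstroms (illustrative values): H, O, H
positions = [(0.9572, 0.0, 0.0), (0.0, 0.0, 0.0), (-0.2400, 0.9266, 0.0)]
masses = [1.008, 15.999, 1.008]

com = center_of_mass(positions, masses)
print(com)  # lies close to the oxygen atom, which carries most of the mass
```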
Trajectory
The main container for all simulation data over time.
trajectory = smolsat.Trajectory()
# Add particles
for i in range(100):
    trajectory.add_particle(i, type=1, mass=1.0, type_name="A")
# Add time information
for frame in range(1000):
    trajectory.add_time(frame * 0.001)  # 1 fs timesteps
    trajectory.add_box_size(smolsat.Coordinate(10.0, 10.0, 10.0))
Contains:
- All particle data and trajectories
- Time information for each frame
- Simulation box dimensions
- Molecular definitions
System
High-level interface for analysis with periodic boundary condition support.
# Create system from trajectory
system = smolsat.System(trajectory, periodic_boundaries=True)
# System provides analysis-ready interface
particles = [trajectory.particle(i) for i in range(50)]
distances = system.calculate_distances(particles, frame=0)
Features:
- Periodic boundary condition handling
- Efficient distance and displacement calculations
- Analysis method integration
🔬 Analysis Methods
Mean Square Displacement (MSD)
Measures particle mobility and diffusion properties.
# Quick MSD analysis
lag_times, msd_values = smolsat.quick_msd(trajectory)
# Advanced MSD with specific particles
system = smolsat.System(trajectory, periodic_boundaries=True)
selected_particles = trajectory.particles_of_type(1)[:50]
msd_analysis = smolsat.MeanSquareDisplacement(system, selected_particles)
msd_analysis.compute()
Applications:
- Diffusion coefficient calculation
- Mobility characterization
- Transport property analysis
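For reference, the quantity behind an MSD analysis can be sketched in plain Python. This is a minimal illustration of the textbook definition, averaged over particles and time origins, and is not SMolSAT's implementation. The names `mean_square_displacement` and `diffusion_coefficient` are hypothetical, and positions are assumed to be unwrapped. The diffusion coefficient follows from the Einstein relation, MSD ≈ 2·d·D·t in the diffusive regime, here taken from a least-squares slope.

```python
def mean_square_displacement(positions, max_lag):
    """MSD for lags 0..max_lag.

    positions[frame][particle] is an unwrapped (x, y, z) tuple.
    Averages over all particles and all available time origins.
    """
    n_frames = len(positions)
    n_particles = len(positions[0])
    msd = []
    for lag in range(max_lag + 1):
        total = 0.0
        count = 0
        for t0 in range(n_frames - lag):
            for p in range(n_particles):
                a = positions[t0][p]
                b = positions[t0 + lag][p]
                total += sum((b[k] - a[k]) ** 2 for k in range(3))
                count += 1
        msd.append(total / count)
    return msd

def diffusion_coefficient(lag_times, msd, dimensions=3):
    """Einstein relation: D = slope / (2 * d), slope from a least-squares fit."""
    n = len(lag_times)
    mean_t = sum(lag_times) / n
    mean_m = sum(msd) / n
    slope = sum((t - mean_t) * (m - mean_m) for t, m in zip(lag_times, msd))
    slope /= sum((t - mean_t) ** 2 for t in lag_times)
    return slope / (2 * dimensions)

# One particle moving 0.1 length units per frame along x (ballistic motion)
frames = [[(0.1 * f, 0.0, 0.0)] for f in range(10)]
msd = mean_square_displacement(frames, max_lag=3)
print(msd)  # grows as (0.1 * lag) ** 2: about 0, 0.01, 0.04, 0.09

# Synthetic, perfectly diffusive MSD with slope 3 gives D = 3 / 6
print(diffusion_coefficient([0.0, 1.0, 2.0, 3.0], [0.0, 3.0, 6.0, 9.0]))  # 0.5
```

The fit is only meaningful over the range of lag times where the MSD is linear in time; the ballistic example above is quadratic and would not give a valid diffusion coefficient.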
Radius of Gyration (Rg)
Characterizes molecular size and compactness.
# System-wide radius of gyration
times, rg_values = smolsat.quick_rg(trajectory)
# Molecular radius of gyration
molecules = [trajectory.molecule(i) for i in range(trajectory.num_molecules())]
rg_analysis = smolsat.RadiusOfGyration(system, molecules)
Applications:
- Polymer conformation analysis
- Molecular size evolution
- Structural compactness studies
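To show how Rg reflects compactness, the following plain-Python sketch compares an extended and a folded five-bead chain. It assumes equal masses, in which case Rg is the root-mean-square distance of the beads from their geometric center. The function name `radius_of_gyration` is illustrative and independent of SMolSAT.

```python
import math

def radius_of_gyration(positions):
    """Rg for equal-mass particles: RMS distance from the geometric center."""
    n = len(positions)
    center = tuple(sum(p[axis] for p in positions) / n for axis in range(3))
    mean_sq = sum(
        sum((p[axis] - center[axis]) ** 2 for axis in range(3))
        for p in positions
    ) / n
    return math.sqrt(mean_sq)

# Five beads in a straight line, spaced 1.0 apart
extended = [(float(i), 0.0, 0.0) for i in range(5)]
# Five beads folded into a unit square plus its center
compact = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0),
    (0.0, 1.0, 0.0), (0.5, 0.5, 0.0),
]

print(radius_of_gyration(extended))  # sqrt(2), about 1.414
print(radius_of_gyration(compact))   # sqrt(0.4), about 0.632
```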
📁 Data Loading
File Format Support
SMolSAT supports various molecular dynamics file formats:
# XYZ format (most common)
trajectory = smolsat.load_xyz("simulation.xyz")
# Using loader directly
loader = smolsat.XYZLoader()
trajectory = loader.load("trajectory_file.xyz")
Data Conversion
Convert between SMolSAT and NumPy formats:
# SMolSAT to NumPy
data = smolsat.trajectory_to_numpy(trajectory)
positions = data['positions'] # Shape: (frames, particles, 3)
times = data['times']
# NumPy to SMolSAT
new_trajectory = smolsat.numpy_to_trajectory(
    positions=positions,
    times=times,
    box_sizes=box_sizes  # Per-frame box dimensions; define this before calling
)
🔄 Periodic Boundary Conditions
Essential for molecular dynamics simulations with finite simulation boxes.
Concepts
# PBC-aware distance calculation
coord1 = smolsat.Coordinate(0.1, 0.1, 0.1)
coord2 = smolsat.Coordinate(9.9, 9.9, 9.9)
box_size = smolsat.Coordinate(10.0, 10.0, 10.0)
# Regular distance (wrong for PBC)
regular_distance = coord1.distance_to(coord2) # ~17.0
# PBC distance (correct)
pbc_distance = coord1.distance_to_pbc(coord2, box_size)  # ~0.35
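The numbers in this example can be checked with a minimal plain-Python sketch of the minimum-image convention. It assumes an orthorhombic (rectangular) box, and the function name `distance_pbc` is illustrative rather than part of SMolSAT.

```python
import math

def distance_pbc(a, b, box):
    """Minimum-image distance between two points in an orthorhombic box."""
    sq = 0.0
    for ai, bi, length in zip(a, b, box):
        d = bi - ai
        # Shift the separation by whole box lengths to its nearest image
        d -= length * round(d / length)
        sq += d * d
    return math.sqrt(sq)

coord1 = (0.1, 0.1, 0.1)
coord2 = (9.9, 9.9, 9.9)
box = (10.0, 10.0, 10.0)

print(math.dist(coord1, coord2))          # about 16.97 (ignores periodicity)
print(distance_pbc(coord1, coord2, box))  # about 0.346 (nearest image)
```

Each component differs by 9.8 directly but only by 0.2 through the periodic boundary, so the minimum-image distance is sqrt(3 × 0.2²) ≈ 0.346.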
Coordinate Wrapping
# Wrap coordinates into primary cell
unwrapped = smolsat.Coordinate(12.5, -3.2, 15.1)
wrapped = unwrapped.wrap_pbc(box_size) # (2.5, 6.8, 5.1)
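The wrapped values above follow from a per-axis modulo operation. The sketch below reproduces them in plain Python, assuming an orthorhombic box whose primary cell spans [0, L) along each axis. The function here is a standalone illustration that happens to share the name `wrap_pbc`; it is not the SMolSAT method.

```python
def wrap_pbc(coord, box):
    """Map each component into [0, L) for an orthorhombic box."""
    # Python's % returns a result with the sign of the divisor,
    # so negative coordinates wrap to the positive side of the box.
    return tuple(x % length for x, length in zip(coord, box))

box = (10.0, 10.0, 10.0)
wrapped = wrap_pbc((12.5, -3.2, 15.1), box)
print(wrapped)  # approximately (2.5, 6.8, 5.1)
```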
🎯 Analysis Workflows
Typical Analysis Pipeline
# 1. Load or create trajectory
trajectory = smolsat.load_trajectory("simulation.xyz")
# 2. Create system with PBC
system = smolsat.System(trajectory, periodic_boundaries=True)
# 3. Select particles/molecules of interest
polymer_particles = trajectory.particles_of_type_name("POL")
solvent_particles = trajectory.particles_of_type_name("SOL")
# 4. Perform analyses
msd_polymer = smolsat.MeanSquareDisplacement(system, polymer_particles)
msd_solvent = smolsat.MeanSquareDisplacement(system, solvent_particles)
# 5. Compute and extract results
msd_polymer.compute()
msd_solvent.compute()
polymer_msd = msd_polymer.msd_values()
solvent_msd = msd_solvent.msd_values()
Batch Analysis
# Analyze multiple properties simultaneously
results = smolsat.analyze_trajectory(
    trajectory,
    analyses=['msd', 'rg', 'diffusion'],
    output_dir="analysis_results"
)
🚀 Performance Considerations
Memory Management
# For large trajectories, use particle selection
total_particles = trajectory.num_particles()
step = max(1, -(-total_particles // 1000))  # Ceiling division: select at most 1000 particles
selected = [trajectory.particle(i) for i in range(0, total_particles, step)]
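To judge when subsampling is worthwhile, a rough size estimate helps. The sketch below assumes positions are stored as 8-byte floats with three components per particle per frame; SMolSAT's actual memory layout may differ. The helper names `position_bytes` and `stride_for_cap` are hypothetical.

```python
def position_bytes(n_frames, n_particles, bytes_per_value=8):
    """Approximate size of a (frames, particles, 3) array of positions."""
    return n_frames * n_particles * 3 * bytes_per_value

def stride_for_cap(total, cap):
    """Smallest stride that selects at most `cap` items out of `total`."""
    return max(1, -(-total // cap))  # ceiling division

n_frames, n_particles = 1000, 100_000
print(position_bytes(n_frames, n_particles) / 1e9)  # 2.4 (GB)

step = stride_for_cap(n_particles, 1000)
selected_ids = range(0, n_particles, step)
print(step, len(selected_ids))                      # 100 1000
print(position_bytes(n_frames, len(selected_ids)) / 1e9)  # 0.024 (GB)
```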
Computational Efficiency
# Use appropriate lag time ranges for MSD
max_lag = min(trajectory.num_frames() // 4, 1000) # Don't exceed 25% of trajectory
lag_times, msd = smolsat.quick_msd(trajectory, max_lag_time=max_lag)
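The reason for capping the lag time is statistical as well as computational: longer lags have fewer frame pairs to average over. The sketch below counts the available time origins per lag, assuming every frame is used as an origin. The function name `origins_per_lag` is illustrative.

```python
def origins_per_lag(n_frames, lag):
    """Number of (t0, t0 + lag) frame pairs available for a given lag."""
    return max(0, n_frames - lag)

n_frames = 1000
max_lag = min(n_frames // 4, 1000)

print(max_lag)                                  # 250
print(origins_per_lag(n_frames, max_lag))       # 750
print(origins_per_lag(n_frames, n_frames - 1))  # 1
```

With the 25% cap, even the longest lag is averaged over 750 time origins, whereas the longest possible lag would rest on a single pair of frames.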
🛠️ Integration with Scientific Python
NumPy Integration
import numpy as np
# Seamless conversion
data = smolsat.trajectory_to_numpy(trajectory)
positions = data['positions']
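# Use NumPy for custom calculations
# Unweighted mean over particles: the geometric center, which equals the
# center of mass only when all particles have the same mass
centroid = np.mean(positions, axis=1)
distances_from_centroid = np.linalg.norm(
    positions - centroid[:, np.newaxis, :],
    axis=2
)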
Matplotlib Integration
import matplotlib.pyplot as plt
# Built-in plotting functions
smolsat.plot_msd(lag_times, msd_values)
smolsat.plot_time_series(times, rg_values, ylabel="Rg")
# Custom plotting with matplotlib
plt.figure(figsize=(10, 6))
plt.loglog(lag_times, msd_values, 'b-', linewidth=2)
plt.xlabel('Lag Time')
plt.ylabel('MSD')
plt.show()
🎯 Best Practices
Data Organization
- Consistent Units: Ensure all data uses consistent units (e.g., Angstroms, picoseconds)
- Proper Typing: Use meaningful particle type names for clarity
- Molecular Definitions: Define molecules appropriately for your analysis
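Unit consistency matters most when results are compared with literature values. As an example, a diffusion coefficient obtained from positions in Angstroms and times in picoseconds comes out in Å²/ps; the sketch below converts it to cm²/s. The constant and function names are illustrative.

```python
ANGSTROM_IN_CM = 1e-8
PICOSECOND_IN_S = 1e-12

def angstrom2_per_ps_to_cm2_per_s(d):
    """Convert a diffusion coefficient from A^2/ps to cm^2/s (factor 1e-4)."""
    return d * ANGSTROM_IN_CM ** 2 / PICOSECOND_IN_S

# Liquid water near room temperature diffuses at roughly 0.23 A^2/ps
print(angstrom2_per_ps_to_cm2_per_s(0.23))  # about 2.3e-05 cm^2/s
```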
Analysis Strategy
- Start Simple: Use quick_* functions for initial exploration
- Validate Results: Check analysis parameters and results for physical reasonableness
- Particle Selection: Focus on relevant particles for computational efficiency
- Statistical Significance: Ensure sufficient sampling for reliable statistics
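One common way to check whether sampling is sufficient is block averaging: split a time series into blocks, average each block, and estimate the standard error from the spread of the block means. The sketch below is a generic plain-Python illustration on synthetic data, not a SMolSAT feature; the function name `block_average` is hypothetical.

```python
import math
import random
import statistics

def block_average(values, n_blocks):
    """Return (mean, standard error) estimated from block means.

    Trailing values that do not fill a complete block are discarded.
    """
    block_size = len(values) // n_blocks
    if n_blocks < 2 or block_size < 1:
        raise ValueError("need at least 2 blocks with 1 value each")
    block_means = [
        statistics.fmean(values[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]
    mean = statistics.fmean(block_means)
    stderr = statistics.stdev(block_means) / math.sqrt(n_blocks)
    return mean, stderr

# Synthetic "Rg over time" series: noise around a true value of 5.0
rng = random.Random(0)
rg_series = [rng.gauss(5.0, 1.0) for _ in range(1000)]

mean, stderr = block_average(rg_series, n_blocks=10)
print(f"Rg = {mean:.2f} +/- {stderr:.2f}")
```

If the estimated error keeps growing as the blocks get longer, the samples are still correlated and the trajectory is likely too short for the quantity being measured.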
Performance Optimization
- Memory Usage: Monitor memory consumption for large trajectories
- Compiled Backend: SMolSAT runs its core computations in an efficient C++ backend for performance
- Data Preprocessing: Clean and validate data before analysis
Understanding these concepts will help you effectively use SMolSAT for your molecular dynamics analysis needs! 🎓