Main interface for the MLSampling package. Integrates multiple machine learning models (BDL, RF, UDL, UFN) and ensemble strategies for optimizing spatial sampling designs. Provides advanced spatial analysis, design comparison, and uncertainty quantification.
Details
The MLSampling class integrates all constitutional principles:
Code Quality Excellence: Comprehensive error handling and validation
Spatial Analysis Excellence: Modern terra/sf usage with CRS validation
Testing Standards: 90%+ test coverage with TDD approach
User Experience Consistency: Consistent APIs and progress feedback
Performance Excellence: Memory efficiency and parallel support
It combines the core capabilities of the legacy SoilSamplingTool with advanced ML modules:
Bayesian Deep Learning (BDL) for uncertainty quantification
Random Forest (RF) for feature importance-based optimization
Ensemble methods (Voting, Stacking)
Unified Deep Learning (UDL) & Unified Feature Network (UFN)
Public fields
config_manager: Configuration manager instance
validation_service: Spatial data validation service
benchmarking_service: Performance benchmarking service
progress_manager: Progress tracking manager
resource_manager: Resource manager
supported_algorithms: Supported optimization algorithms
constitutional_compliance: Constitutional compliance tracker
bdl_module: Bayesian Deep Learning module instance
rf_module: Random Forest module instance
ensemble_manager: Ensemble manager instance
spatial_engine: Spatial analysis engine instance
comparison_engine: Design comparison framework instance
reporting_service: Reporting service instance
visualization_service: Visualization service instance
Methods
Method new()
Initialize the MLSampling tool.
Usage
MLSampling$new(config = NULL, config_manager = NULL, validate_system = FALSE)
Method run_udl()
Usage
MLSampling$run_udl(
  field_data = NULL,
  existing_samples = NULL,
  n_new_samples,
  optimization_method = "greedy",
  model_config = NULL,
  parallel = FALSE,
  max_iter = NULL,
  save_csv = FALSE,
  ...
)
Arguments
field_data: List containing boundary, covariates, and metadata
existing_samples: Optional data frame with existing sample locations
n_new_samples: Number of new samples to select
optimization_method: Optimization algorithm to use
model_config: Optional model configuration
parallel: Whether to use parallel processing
Method run_ufn()
Usage
MLSampling$run_ufn(
  field_data = NULL,
  existing_samples = NULL,
  n_new_samples,
  model_config = NULL,
  force_neural_network = FALSE,
  force_statistical_fallback = FALSE,
  ...
)
Arguments
field_data: List containing boundary, covariates, and metadata
existing_samples: Optional data frame with existing sample locations
n_new_samples: Number of new samples to select
model_config: UFN model configuration
force_neural_network: Force use of the neural network (requires torch)
force_statistical_fallback: Force use of the statistical fallback
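Because force_neural_network is documented as requiring torch, a caller may want to check for torch before setting it. The following is a minimal sketch of one way to do that in base R. The helper choose_ufn_flags() is hypothetical and not part of MLSampling, and the package's own backend selection logic is not documented here.

```r
# Hypothetical helper (not part of MLSampling): pick values for the two
# run_ufn() flags depending on whether a package (default "torch") is installed.
choose_ufn_flags <- function(prefer_nn = TRUE, pkg = "torch") {
  # system.file() returns "" when the package is not installed; unlike
  # requireNamespace(), it does not load the package.
  has_pkg <- nzchar(system.file(package = pkg))
  use_nn <- prefer_nn && has_pkg
  list(
    force_neural_network = use_nn,
    force_statistical_fallback = !use_nn
  )
}

flags <- choose_ufn_flags()
str(flags)
```

The resulting values could then be passed as the force_neural_network and force_statistical_fallback arguments of run_ufn(). Leaving both at their default of FALSE presumably lets the method choose for itself, so forcing one path is only useful when the choice needs to be explicit.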
Method run_bdl()
Usage
MLSampling$run_bdl(
  field_data,
  existing_samples,
  n_new_samples,
  uncertainty_type = "total",
  mc_iterations = 100,
  constitutional_compliance = TRUE,
  save_csv = FALSE
)
Method run_rf_optimization()
Usage
MLSampling$run_rf_optimization(
  field_data,
  existing_samples,
  n_new_samples,
  feature_importance_method = "permutation",
  spatial_autocorr = TRUE,
  constitutional_compliance = TRUE,
  save_csv = FALSE
)
Method run_ensemble()
Usage
MLSampling$run_ensemble(
  field_data,
  existing_samples,
  n_new_samples,
  methods = c("BDL", "RF"),
  ensemble_method = "voting",
  constitutional_compliance = TRUE
)
Method compare_designs()
Method compare_models()
Usage
MLSampling$compare_models(
  field_data = NULL,
  existing_samples = NULL,
  n_new_samples,
  algorithms = self$supported_algorithms[1:3],
  n_iterations = 5,
  parallel = FALSE,
  confidence_level = 0.95,
  ...
)
Arguments
field_data: List containing boundary, covariates, and metadata
existing_samples: Optional data frame with existing sample locations
n_new_samples: Number of new samples to select
algorithms: Vector of algorithms to compare
n_iterations: Number of iterations for statistical comparison
parallel: Whether to use parallel processing
confidence_level: Confidence level for statistical tests
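To illustrate how n_iterations and confidence_level relate, the sketch below runs a paired t-test on per-iteration scores of two algorithms using base R. It is illustrative only: the scores are simulated, the names algo_a and algo_b are placeholders, and which statistical test compare_models() actually applies is not stated in this documentation.

```r
# Illustrative only: simulated scores stand in for real optimization results.
set.seed(42)
n_iterations <- 5
confidence_level <- 0.95

# Hypothetical quality scores from repeated runs of two algorithms
scores <- data.frame(
  iteration = seq_len(n_iterations),
  algo_a = rnorm(n_iterations, mean = 0.80, sd = 0.02),
  algo_b = rnorm(n_iterations, mean = 0.75, sd = 0.02)
)

# Paired t-test on the per-iteration scores; conf.level sets the
# coverage of the confidence interval for the mean difference.
tt <- t.test(scores$algo_a, scores$algo_b, paired = TRUE,
             conf.level = confidence_level)
ci <- tt$conf.int

cat(sprintf("Mean difference: %.3f, %.0f%% CI: [%.3f, %.3f]\n",
            unname(tt$estimate), 100 * confidence_level, ci[1], ci[2]))
```

With only a handful of iterations the interval tends to be wide, so raising n_iterations is the usual way to tighten a comparison of this kind, at the cost of more optimization runs.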
Method quantify_uncertainty()
Usage
MLSampling$quantify_uncertainty(
  predictions,
  method = "ensemble",
  uncertainty_type = "total"
)
Method generate_ml_report()
Usage
MLSampling$generate_ml_report(
  result,
  report_type = "comprehensive",
  include_uncertainty_analysis = TRUE,
  include_visualizations = TRUE,
  constitutional_compliance = TRUE,
  output_dir = getwd()
)
Method generate_report()
Usage
MLSampling$generate_report(
  optimization_result = NULL,
  output_format = "html",
  report_config = NULL,
  export_path = NULL
)
Method save_coordinates_to_csv()
Usage
MLSampling$save_coordinates_to_csv(
  optimization_result = NULL,
  file_path,
  output_crs = NULL,
  include_metadata = FALSE,
  include_crs_info = FALSE,
  decimal_places = 6,
  coordinate_format = "decimal",
  column_names = NULL,
  include_fields = NULL,
  include_covariate_values = FALSE,
  validate_export = TRUE,
  constitutional_compliance = FALSE,
  quality_assurance = FALSE,
  standard_format = TRUE
)
Arguments
optimization_result: Result from optimization
file_path: Path for CSV export
output_crs: Target coordinate system for export
include_metadata: Whether to include metadata
include_crs_info: Whether to include CRS information
decimal_places: Number of decimal places for coordinates
coordinate_format: Coordinate format option
column_names: Custom column names
include_fields: Additional fields to include
include_covariate_values: Whether to include covariate values
validate_export: Whether to validate exported data
constitutional_compliance: Whether to include constitutional compliance info
quality_assurance: Whether to perform quality checks
standard_format: Whether to use standardized format
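The effect of decimal_places and column_names can be pictured with a small base R stand-in. The function write_coords_csv() below is hypothetical and is not the package's implementation. It assumes the coordinates arrive as a data frame with x and y columns, which this documentation does not confirm.

```r
# Hypothetical stand-in, not the package's implementation: round coordinates
# to a fixed number of decimal places and write them to a CSV file.
write_coords_csv <- function(coords, file_path, decimal_places = 6,
                             column_names = NULL) {
  stopifnot(is.data.frame(coords), all(c("x", "y") %in% names(coords)))
  out <- coords
  out$x <- round(out$x, decimal_places)
  out$y <- round(out$y, decimal_places)
  if (!is.null(column_names)) {
    # Custom names must cover every exported column
    stopifnot(length(column_names) == ncol(out))
    names(out) <- column_names
  }
  write.csv(out, file_path, row.names = FALSE)
  invisible(out)
}

# Made-up example coordinates in decimal degrees
coords <- data.frame(x = c(-93.6312345678, -93.6298765432),
                     y = c(42.0212345678, 42.0198765432))
path <- tempfile(fileext = ".csv")
write_coords_csv(coords, path, decimal_places = 6,
                 column_names = c("longitude", "latitude"))

# Read the file back to confirm the export round-trips
exported <- read.csv(path)
print(exported)
```

Reading the file back after writing is one simple form of the check that validate_export suggests, though how the package validates its exports is not described here. For coordinates in decimal degrees, six decimal places correspond to roughly 0.1 m on the ground.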
