Training set augmentation and biology-aware harmonization improve radiomic models for lung cancer prediction in indeterminate nodules