EI-HPC - Enabling Infrastructure for HPC-Applications (DE/BMBF/13FH156IN6)
Force field (FF) based molecular modeling is a widely used method to investigate structural and dynamic properties of (bio-)chemical substances and systems. When such a system is modeled or refined, the force field parameters need to be adjusted. This force field parameter optimization can be a tedious task and is always a trade-off in terms of errors regarding the targeted properties. To better control the balance of the various properties' errors, in this study we introduce weighting factors for the optimization objectives. Different weighting strategies are compared to fine-tune the balance between bulk-phase density and relative conformational energies (RCE), using n-octane as a representative system. Additionally, a non-linear projection of the individual property-specific parts of the optimized loss function is deployed to further improve the balance between them. The results show that the overall error is reduced. One interesting outcome is a large variety in the resulting optimized force field parameters (FFParams) and corresponding errors, suggesting that the optimization landscape is multi-modal and highly dependent on the weighting factor setup. We conclude that adjusting the weighting factors can be a very important feature for lowering the overall error in the FF optimization procedure, giving researchers the possibility to fine-tune their FFs.
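A minimal sketch of the weighting idea described above, assuming the property-specific errors have already been computed (e.g. as relative deviations of simulated bulk-phase density and RCEs from their targets) and using a simple power law as one possible non-linear projection; this is an illustration, not the paper's loss function:

```python
def weighted_loss(e_density, e_rce, w_density, w_rce, gamma=2.0):
    """Illustrative weighted FF-optimization loss.

    e_density, e_rce: property-specific errors (assumed precomputed).
    w_density, w_rce: weighting factors for the two objectives.
    gamma: power-law exponent, one possible non-linear projection of the
    individual error terms before summation.
    """
    return w_density * e_density**gamma + w_rce * e_rce**gamma

# Example: equal weights, quadratic projection of both error terms.
print(weighted_loss(e_density=0.02, e_rce=0.10, w_density=1.0, w_rce=1.0))
```

Sweeping the weighting factors then traces out the trade-off between the two property errors, which is the fine-tuning knob the study investigates.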
The representation, or encoding, utilized in evolutionary algorithms has a substantial effect on their performance. Examination of the suitability of widely used representations for quality diversity optimization (QD) in robotic domains has yielded inconsistent results regarding the most appropriate encoding method. Given the domain-dependent nature of QD, additional evidence from other domains is necessary. This study compares the impact of several representations, including direct encoding, a dictionary-based representation, parametric encoding, compositional pattern producing networks, and cellular automata, on the generation of voxelized meshes in an architecture setting. The results reveal that some indirect encodings outperform direct encodings and can generate more diverse solution sets, especially when considering full phenotypic diversity. The paper introduces a multi-encoding QD approach that incorporates all evaluated representations in the same archive. Species of encodings compete on the basis of phenotypic features, leading to an approach that demonstrates similar performance to the best single-encoding QD approach. This is noteworthy, as it does not always require the contribution of the best-performing single encoding.
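The shared archive can be pictured with a short MAP-Elites-style sketch; the `evaluate` and `features` callables below are assumptions for illustration, not the paper's implementation:

```python
archive = {}  # phenotypic feature cell -> (fitness, genome, encoding name)

def try_insert(genome, encoding_name, evaluate, features):
    """Insert a candidate into the shared archive; candidates produced by
    different encodings compete purely on phenotypic features."""
    fitness = evaluate(genome)   # e.g. objective value of the voxelized mesh
    cell = features(genome)      # discretized phenotypic descriptor (a tuple)
    incumbent = archive.get(cell)
    if incumbent is None or fitness > incumbent[0]:
        archive[cell] = (fitness, genome, encoding_name)
```

Because the cell key is purely phenotypic, a CPPN-generated mesh and a directly encoded mesh can displace one another in the same niche, which is what lets the species of encodings compete.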
Quality diversity algorithms can be used to efficiently create a diverse set of solutions to inform engineers' intuition. But quality diversity is not efficient for very expensive problems, needing 100,000s of evaluations. Even with the assistance of surrogate models, quality diversity needs 100s or even 1,000s of evaluations, which can make its use infeasible. In this study we tackle this problem by using a pre-optimization strategy on a lower-dimensional optimization problem and then mapping the solutions to the higher-dimensional case. For a use case of designing buildings that minimize wind nuisance, we show that flow features around 3D buildings can be predicted from 2D flow features around building footprints. By sampling the space of 2D footprints with a quality diversity algorithm, a predictive model can be trained that is more accurate than one trained on footprints selected with a space-filling method like the Sobol sequence. By simulating only 16 buildings in 3D, a set of 1024 building designs with low predicted wind nuisance is created. We show that better machine learning models can be produced by generating training data with quality diversity instead of common sampling techniques. The method can bootstrap generative design in a computationally expensive 3D domain and allow engineers to sweep the design space, understanding wind nuisance in early design phases.
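A sketch of the surrogate step under stated assumptions: `qd_sample_footprints`, `flow_features_2d`, and `flow_features_3d` are hypothetical stand-ins for the QD sampler and the CFD feature extraction, while the scikit-learn call is real:

```python
from sklearn.ensemble import RandomForestRegressor

# Hypothetical helpers: a QD archive sampler and 2D/3D CFD feature extractors.
footprints = qd_sample_footprints(n=16)          # diverse 2D designs from the QD archive
X = [flow_features_2d(fp) for fp in footprints]  # cheap 2D flow simulations
y = [flow_features_3d(fp) for fp in footprints]  # expensive 3D flow simulations
surrogate = RandomForestRegressor().fit(X, y)    # predicts 3D features for unseen footprints
```

The trained surrogate can then rank a much larger set of candidate footprints by predicted wind nuisance without further 3D simulations.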
Risk-based authentication (RBA) aims to protect users against attacks involving stolen passwords. RBA monitors features during login and requests re-authentication when feature values differ widely from those previously observed. It is recommended by various national security organizations, and users perceive it as more usable than, and equally secure to, equivalent two-factor authentication. Despite that, RBA is still used by very few online services. Reasons for this include a lack of validated open resources on RBA properties, implementation, and configuration. This effectively hinders progress in RBA research, development, and adoption.
To close this gap, we provide the first long-term RBA analysis on a real-world large-scale online service. We collected feature data of 3.3 million users and 31.3 million login attempts over more than 1 year. Based on the data, we provide (i) studies on RBA's real-world characteristics plus its configurations and enhancements to balance usability, security, and privacy; (ii) a machine learning-based RBA parameter optimization method to help administrators find an optimal configuration for their own use case; (iii) an evaluation of the round-trip time feature's potential to replace the IP address for enhanced user privacy; and (iv) a synthesized RBA dataset to reproduce this research and to foster future RBA research. Our results provide insights into selecting an optimized RBA configuration so that users profit from RBA after just a few logins. The open dataset enables researchers to study, test, and improve RBA for widespread deployment in the wild.
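A minimal sketch of the RBA decision flow (not the studied service's model): the risk score grows with the fraction of login features that have no match in the user's history, and crossing a threshold triggers re-authentication:

```python
def risk_score(login_features, history):
    """Fraction of features whose current value was never seen before."""
    unseen = sum(1 for name, value in login_features.items()
                 if value not in history.get(name, set()))
    return unseen / len(login_features)

def login_decision(login_features, history, threshold=0.4):
    # Above the threshold, request an additional proof of identity
    # (e.g. an email verification code); otherwise allow the login.
    if risk_score(login_features, history) > threshold:
        return "REAUTHENTICATE"
    return "ALLOW"

# Example with illustrative feature values.
history = {"ip": {"198.51.100.7"}, "user_agent": {"UA-1"}, "language": {"en"}}
login = {"ip": "203.0.113.9", "user_agent": "UA-1", "language": "en"}
print(login_decision(login, history))  # one of three features unseen -> "ALLOW"
```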
In this thesis it is posited that the central object of preference discovery is a co-creative process in which the Other can be represented by a machine. It explores efficient methods to enhance introverted intuition using extraverted intuition's communication lines. Possible implementations of such processes are presented using novel algorithms that perform divergent search to feed the users' intuition with many examples of high-quality solutions, allowing them to exert influence interactively. The machine feeds and reflects upon human intuition, combining both what is possible and what is preferred. The machine model and the divergent optimization algorithms are the motor behind this co-creative process, in which machine and users co-create and interactively choose branches of an ad hoc hierarchical decomposition of the solution space.
The proposed co-creative process consists of several elements: a formal model for interactive co-creative processes, evolutionary divergent search, diversity and similarity, data-driven methods to discover diversity, limitations of artificial creative agents, matters of efficiency in behavioral and morphological modeling, visualization, a connection to prototype theory, and methods to allow users to influence artificial creative agents. This thesis helps put the human back into the design loop in generative AI and optimization.
Mebendazole Mediates Proteasomal Degradation of GLI Transcription Factors in Acute Myeloid Leukemia
(2021)
The prognosis of elderly AML patients is still poor due to chemotherapy resistance. The Hedgehog (HH) pathway is important for leukemic transformation because of aberrant activation of GLI transcription factors. Mebendazole (MBZ) is a well-tolerated anthelmintic that exhibits strong antitumor effects. Herein, we show that MBZ induced strong, dose-dependent anti-leukemic effects on AML cells, including the sensitization of AML cells to chemotherapy with cytarabine. MBZ strongly reduced intracellular protein levels of the GLI1/GLI2 transcription factors. Consequently, MBZ reduced GLI promoter activity, as observed in luciferase-based reporter assays in AML cell lines. Further analysis revealed that MBZ mediates its anti-leukemic effects by promoting the proteasomal degradation of GLI transcription factors via inhibition of HSP70/90 chaperone activity. Extensive molecular dynamics simulations of the MBZ-HSP90 complex showed a stable binding interaction at the ATP binding site. Importantly, two patients with refractory AML were treated with MBZ in an off-label setting, and MBZ effectively reduced GLI signaling activity in a modified plasma inhibitory assay, resulting in a decrease in peripheral blood blast counts in one patient. Our data prove that MBZ is an effective GLI inhibitor that should be evaluated in combination with conventional chemotherapy in the clinical setting.
Recent experimental evidence suggests that mebendazole, a popular antiparasitic drug, binds to heat shock protein 90 (Hsp90) and inhibits acute myeloid leukemia cell growth. In this study we use quantum mechanics (QM), molecular similarity and molecular dynamics (MD) calculations to predict possible binding poses of mebendazole to the adenosine triphosphate (ATP) binding site of Hsp90. Extensive conformational searches and minimizations of the five tautomers of mebendazole at the MP2/aug-cc-pVTZ level of theory resulted in the identification of 152 minima. Mebendazole-Hsp90 complex models were created using the QM-optimized conformations and protein coordinates obtained from experimental crystal structures that were chosen through similarity calculations. Nine different poses were identified from a total of 600 ns of explicit-solvent, all-atom MD simulations using two different force fields. All simulations support the hypothesis that mebendazole is able to bind to the ATP binding site of Hsp90.
Turbulent compressible flows are traditionally simulated using explicit time integrators applied to discretized versions of the Navier-Stokes equations. However, the associated Courant-Friedrichs-Lewy condition severely restricts the maximum time-step size. Exploiting the Lagrangian nature of the Boltzmann equation’s material derivative, we now introduce a feasible three-dimensional semi-Lagrangian lattice Boltzmann method (SLLBM), which circumvents this restriction. While many lattice Boltzmann methods for compressible flows were restricted to two dimensions due to the enormous number of discrete velocities in three dimensions, the SLLBM uses only 45 discrete velocities. Based on compressible Taylor-Green vortex simulations we show that the new method accurately captures shocks or shocklets as well as turbulence in 3D without utilizing additional filtering or stabilizing techniques other than the filtering introduced by the interpolation, even when the time-step sizes are up to two orders of magnitude larger compared to simulations in the literature. Our new method therefore enables researchers to study compressible turbulent flows by a fully explicit scheme, whose range of admissible time-step sizes is dictated by physics rather than spatial discretization.
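For reference, the CFL restriction mentioned above has the standard form (with Courant number C, grid spacing Δx, maximum flow speed, and speed of sound c_s):

```latex
\Delta t \;\le\; C\, \frac{\Delta x}{|u|_{\max} + c_s}
```

Because the semi-Lagrangian step integrates the distribution functions along characteristics, the departure points may lie several cells upstream, which is what decouples the admissible time step from the grid spacing.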
Off-lattice Boltzmann methods increase the flexibility and applicability of lattice Boltzmann methods by decoupling the discretizations of time, space, and particle velocities. However, the velocity sets mostly used in off-lattice Boltzmann simulations were originally tailored to on-lattice Boltzmann methods. In this contribution, we show how the accuracy and efficiency of weakly and fully compressible semi-Lagrangian off-lattice Boltzmann simulations are increased by velocity sets derived from cubature rules, i.e. multivariate quadratures not produced by the Gauß product rule. In particular, simulations of 2D shock-vortex interactions indicate that the cubature-derived degree-nine D2Q19 velocity set is capable of replacing the Gauß-product-rule-derived D2Q25. Likewise, the degree-five velocity sets D3Q13 and D3Q21, as well as a degree-seven D3V27 velocity set, were successfully tested for 3D Taylor–Green vortex flows and challenge or surpass the quality of the customary D3Q27 velocity set. In compressible 3D Taylor–Green vortex flows, on-lattice simulations with the velocity sets D3Q103 and D3V107 showed only limited stability, while the off-lattice degree-nine D3Q45 velocity set accurately reproduced the kinetic energy evolution reported in the literature.
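The construction criterion behind both Gauß-product and cubature velocity sets is the standard moment-matching condition: the discrete weights and velocities must reproduce the Gaussian (Maxwellian) moments up to the stated algebraic degree d, written here in 2D and in units where the lattice speed of sound is one:

```latex
\sum_i w_i\, c_{i,x}^{\,m}\, c_{i,y}^{\,n}
  \;=\; \frac{1}{2\pi} \int_{\mathbb{R}^2} e^{-\|\boldsymbol{\xi}\|^2/2}\,
        \xi_x^{\,m}\, \xi_y^{\,n}\, d\boldsymbol{\xi},
  \qquad m + n \le d
```

A Gauß-product rule with q nodes per dimension needs q^D points (e.g. 5^2 = 25 for degree nine in 2D), whereas a cubature rule can reach the same degree with fewer points, such as the 19 points of the D2Q19 set mentioned above.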
Risk-based authentication (RBA) aims to strengthen password-based authentication rather than replacing it. RBA does this by monitoring and recording additional features during the login process. If feature values at login time differ significantly from those observed before, RBA requests an additional proof of identification. Although RBA is recommended in the NIST digital identity guidelines, it has so far been used almost exclusively by major online services. This is partly due to a lack of open knowledge and implementations that would allow any service provider to roll out RBA protection to its users.
To close this gap, we provide a first in-depth analysis of RBA characteristics in a practical deployment. We observed N=780 users with 247 unique features on a real-world online service for over 1.8 years. Based on our collected data set, we provide (i) a behavior analysis of two RBA implementations that were apparently used by major online services in the wild, (ii) a benchmark of the features to extract a subset that is most suitable for RBA use, (iii) a new feature that has not been used in RBA before, and (iv) factors which have a significant effect on RBA performance. Our results show that RBA needs to be carefully tailored to each online service, as even small configuration adjustments can greatly impact RBA's security and usability properties. We provide insights on the selection of features, their weightings, and the risk classification in order to benefit from RBA after a minimum number of login attempts.
Final report on the BMBF-funded project Enabling Infrastructure for HPC-Applications (EI-HPC)
(2020)
This work thoroughly investigates a semi-Lagrangian lattice Boltzmann (SLLBM) solver for compressible flows. In contrast to other lattice Boltzmann methods for compressible flows, the vertices are organized in cells, and interpolation polynomials of up to fourth order are used to attain the off-vertex distribution function values. Unlike the recently introduced Particles on Demand (PoD) method, the present method operates in a static, non-moving reference frame. Nevertheless, the SLLBM in this formulation supports supersonic flows and exhibits a high degree of Galilean invariance. Thanks to the integration along characteristics, the SLLBM solver allows for an independently chosen time step size and for the use of unusual velocity sets, like the D2Q25, which is constructed from the roots of the fifth-order Hermite polynomial. The properties of the present model are demonstrated in simulations of a two-dimensional Taylor-Green vortex, a Sod shock tube, a two-dimensional Riemann problem, and a shock-vortex interaction. It is shown that the cell-based interpolation and the use of Gauss-Lobatto-Chebyshev support points enable spatially high-order solutions and minimize the mass loss caused by the interpolation. Transformed grids in the shock-vortex interaction demonstrate the general applicability to non-uniform grids.
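The D2Q25 construction mentioned above can be reproduced in a few lines; whether the physicists' or the probabilists' Hermite scaling applies depends on the chosen lattice units, so the probabilists' scaling used below is an assumption:

```python
import numpy as np
from itertools import product
from numpy.polynomial.hermite_e import hermegauss

# 1D abscissae and weights: roots of the fifth-order (probabilists')
# Hermite polynomial He_5 with the corresponding Gauss quadrature weights.
xi, w = hermegauss(5)
w /= np.sqrt(2.0 * np.pi)  # normalize so the 1D weights sum to one

# The 25 discrete velocities of a D2Q25-type set are the tensor product
# of the five 1D abscissae; the weights multiply accordingly.
velocities = np.array(list(product(xi, repeat=2)))
weights = np.array([wa * wb for wa, wb in product(w, repeat=2)])
assert np.isclose(weights.sum(), 1.0)
```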
In an effort to assist researchers in choosing basis sets for quantum mechanical modeling of molecules (i.e. balancing calculation cost against desired accuracy), we present a systematic study of the accuracy of computed conformational relative energies and their geometries in comparison to MP2/CBS and MP2/AV5Z data, respectively. To this end, we introduce a new nomenclature to unambiguously indicate how a CBS extrapolation was computed. Nineteen minima and transition states of buta-1,3-diene, propan-2-ol and the water dimer were optimized using forty-five different basis sets. Specifically, these include one Pople (i.e. 6-31G(d)), eight Dunning (i.e. VXZ and AVXZ, X=2-5), twenty-five Jensen (i.e. pc-n, pcseg-n, aug-pcseg-n, pcSseg-n and aug-pcSseg-n, n=0-4) and nine Karlsruhe (e.g. def2-SV(P), def2-QZVPPD) basis sets. The molecules were chosen to represent both common and electronically diverse molecular systems. In comparison to MP2/CBS relative energies computed using the largest Jensen basis sets (i.e. n=2,3,4), the use of smaller sizes (n=0,1,2 and n=1,2,3) provides results within 0.11–0.24 and 0.09–0.16 kcal/mol, respectively. To practically guide researchers in their basis set choice, an equation is introduced that ranks basis sets based on a user-defined balance between accuracy and calculation cost. Furthermore, we explain why the aug-pcseg-2, def2-TZVPPD and def2-TZVP basis sets are very suitable choices for balancing speed and accuracy.
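The abstract does not reproduce the ranking equation itself; as a purely hypothetical illustration of the idea, a score could blend a normalized accuracy error with a normalized cost via a user-defined balance parameter:

```python
def rank_score(rel_energy_error, cpu_cost, alpha=0.5):
    """Hypothetical ranking sketch, not the paper's equation: alpha in [0, 1]
    shifts the balance from pure speed (alpha=0) to pure accuracy (alpha=1).
    Both inputs are assumed normalized to [0, 1] across the candidate sets."""
    return alpha * rel_energy_error + (1.0 - alpha) * cpu_cost

# Lower scores rank better; sort candidate basis sets by rank_score.
print(rank_score(rel_energy_error=0.1, cpu_cost=0.3, alpha=0.7))
```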