How many processor cores can MossWinn use?
Posted: Sat Feb 25, 2017 8:31 pm
With multi-core processors having 4-8 cores, possibly along with hyper-threading, becoming more and more common, the question may arise of how many cores one needs in order to best exploit MossWinn's parallel computing algorithms. Regarding hyper-threading, we note that what matters to MossWinn is the maximum number of threads the processor can execute simultaneously, as opposed to the actual number of physical cores. This means, for example, that like the Windows OS, MossWinn sees a 4 Core / 8 Thread (4C/8T) processor as an 8-core one. So whenever we say "cores" below, we actually refer to the number of threads the processor can execute simultaneously.
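To check how many "cores" in this sense a given machine offers, one can query the number of logical processors reported by the operating system. The following is a minimal Python sketch for illustration only; it is not part of MossWinn, and the function name is hypothetical.

```python
import os


def logical_processor_count() -> int:
    """Return the number of logical processors reported by the OS (at least 1)."""
    count = os.cpu_count()  # may be None if the count cannot be determined
    return count if count is not None else 1


if __name__ == "__main__":
    print(f"Logical processors: {logical_processor_count()}")
```

On a 4C/8T processor with hyper-threading enabled, this would typically report 8.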
It should also be considered that - at least in the case of MossWinn - multi-core computing is useful first of all to make slow processes faster, as opposed to making an already fast process even faster. This is because setting up and launching multiple threads takes some extra execution time, and this investment does not pay off when it is comparable to the execution time of the original process, which may well happen when the latter is an easy task even for a single core.
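This trade-off can be illustrated with a simplified cost model. The sketch below is an assumption made for illustration, not MossWinn's actual timing logic: it supposes that the work divides evenly over the threads and that launching them costs a fixed overhead. The function names and the numbers in the usage note are hypothetical.

```python
def parallel_time(serial_time: float, threads: int, overhead: float) -> float:
    """Idealized run time: evenly divided work plus a fixed thread setup cost."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    if threads == 1:
        return serial_time  # no threads to launch, so no setup cost
    return serial_time / threads + overhead


def worth_parallelizing(serial_time: float, threads: int, overhead: float) -> bool:
    """True if the idealized multi-thread run beats the single-thread run."""
    return parallel_time(serial_time, threads, overhead) < serial_time
```

With a hypothetical overhead of 5 ms, a 1000 ms task on 8 threads takes about 130 ms in this model and is clearly worth parallelizing, whereas a 4 ms task would take 5.5 ms and is better left on a single core.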
MossWinn turns to parallel computing in connection with several of its functions inside and outside the FIT menu. Here we treat the case of the FIT menu only. First we consider FIT menu functions assuming that the model fitted to the spectrum/spectra does not include a PHS (paramagnetic hyperfine structure, see http://www.mosswinn.hu/phsd/phsd.htm) theory, which requires a numerical spherical integral to be carried out (the latter being inherently designed as a parallel algorithm).
In such cases MossWinn uses multi-core computing for the following functions:
Global - This performs fitting via the Evolution Algorithm (EA, see Nuclear Instruments and Methods in Physics Research B 129 (1997) 527, http://dx.doi.org/10.1016/S0168-583X(97)00314-5). From each generation of solutions the algorithm generates 24 offspring whose fitness needs to be evaluated. This evaluation is carried out via a parallel algorithm. From this point of view the number of offspring (24) is a quite suitable choice, as the evaluation job is readily distributed over 2, 3, 4, 6, 8 and 12 processor cores. The maximum number of cores MossWinn can use for this purpose is 12, in which case each thread launched handles the evaluation of only 2 offspring. Note, however, that MossWinn continuously monitors the performance of the Evolution Algorithm, and may decide to use fewer than the maximum number of available cores should this result in a faster overall execution. At the same time, it will always launch at least 2 threads (in which case each thread handles the evaluation of 12 offspring).
Fit - This performs fine-tuning of fit parameters via a local search algorithm. Parallel computing can be realized in this case when several spectra are fitted simultaneously. In such a case MossWinn can distribute the analysis of the simultaneously fitted spectra over several processor cores, each core handling a certain number of spectra. In the latest (2016/10/22) release of MossWinn the program can use at most 12 cores for this purpose, a limit that will be raised to 16 in the next release. The number of cores utilized is nevertheless also limited by the number of spectra fitted simultaneously (given that each thread launched must handle at least 1 spectrum). In addition, even in this case MossWinn evaluates its single-core and multi-core performance before starting the fit, and may decide to use only a single core for the analysis of multiple spectra, should this result in a faster execution. This typically happens when the calculation of the selected model is fast in comparison with the time needed to set up and launch multiple threads.
StD Cal - This calculates the standard deviation of the fit parameters arising from the statistical noise in the spectra. In most cases this will not take too much time (especially when the recommended method - based on the first derivatives - is used), but in case you experience long execution times (which may happen when the number of fit parameters is large, e.g. on the order of 100 or higher) it is useful to know that this procedure can utilize up to 32 processor cores.
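The thread limits described for Global and Fit can be summarized in a short Python sketch. It only computes upper bounds from the numbers given above and does not model MossWinn's run-time performance monitoring, which may select fewer threads. It also assumes that Global uses only thread counts that divide the 24 offspring evenly, which the description suggests but does not state explicitly. All names are hypothetical.

```python
OFFSPRING_PER_GENERATION = 24  # offspring evaluated per EA generation
MAX_GLOBAL_THREADS = 12        # stated maximum for Global
MAX_FIT_THREADS = 12           # 2016/10/22 release; 16 announced for the next one


def even_thread_counts(offspring: int = OFFSPRING_PER_GENERATION,
                       max_threads: int = MAX_GLOBAL_THREADS) -> list[int]:
    """Thread counts (2..max_threads) over which the offspring divide evenly."""
    return [n for n in range(2, max_threads + 1) if offspring % n == 0]


def global_thread_upper_bound(cores: int) -> int:
    """Largest even-split thread count not exceeding the core count (at least 2)."""
    candidates = [n for n in even_thread_counts() if n <= cores]
    return max(candidates) if candidates else 2  # Global always launches >= 2 threads


def fit_thread_upper_bound(cores: int, spectra: int,
                           max_threads: int = MAX_FIT_THREADS) -> int:
    """Upper bound for Fit: each thread must handle at least 1 spectrum."""
    return max(1, min(cores, spectra, max_threads))
```

For example, on a 4C/8T processor (8 "cores") fitting 3 spectra simultaneously, this gives an upper bound of 8 threads for Global (3 offspring per thread) and 3 threads for Fit.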
When a model including PHS (paramagnetic hyperfine structure, see http://www.mosswinn.hu/phsd/phsd.htm) theory - requiring the calculation of a numerical spherical integral - is fitted, MossWinn always uses parallel computing for the calculation of the spherical integral. In this procedure MossWinn can use up to 32 processor cores. When Global Search is performed (via EA) to fit such a model, MossWinn launches in total at least twice as many threads as the number of processor cores (assuming the latter does not exceed 32), which will most probably push processor usage up to 100%. In theory, in such a situation MossWinn could benefit from more than 32 cores.
Overall it is recommended to use a 4C/8T or a 6C/12T processor to run MossWinn, and to consider a higher number of cores (such as 8C/16T) especially when paramagnetic hyperfine structure models are to be fitted.