###############################################################################
# #
# Trilinos Release 14.0 Release Notes #
# #
###############################################################################
CMake
- Installed `<Package>Config.cmake` files now define modern CMake IMPORTED
library targets. This change **breaks backward compatibility** for some
downstream CMake projects by changing how include directories are
handled:
With the switch to modern CMake IMPORTED targets, now include directories
(along with critical compiler options and other modern CMake usage
requirements) are propagated through linking to the IMPORTED library
targets in downstream customer CMake projects. This makes the usage of
the variables `Trilinos_INCLUDE_DIRS` and `Trilinos_TPL_INCLUDE_DIRS`
unnecessary in downstream CMake projects. The variable
`Trilinos_INCLUDE_DIRS` is still set but the variable
`Trilinos_TPL_INCLUDE_DIRS` is empty in this change to modern CMake. To
get include directories for Trilinos packages and TPLs, one must link
against the Trilinos target `Trilinos::all_libs` or
`Trilinos::all_selected_libs` or by linking against the individual package
targets `<packageName>::all_libs` and/or the TPL targets
`<tplName>::all_libs`. To upgrade downstream CMake projects that are
showing missing include directories (i.e. header files can't be found on
the compile lines) just add `target_link_libraries()` calls with the
appropriate `<prefix>::all_libs` targets for the desired package and/or
TPL. (See the section "Using the installed software in downstream CMake
projects" in the updated build reference guide.) However, backward
compatibility is maintained for most customers that are linking all of
their libraries against `${Trilinos_LIBRARIES}`. For these projects, no
changes need to be made.
However, this change to modern CMake targets will also cause downstream
CMake projects to pull in include directories as `SYSTEM` includes
(e.g. using `-isystem` instead of `-I`) from IMPORTED library targets.
This changes how these include directories are searched and could break
some fragile build environments that have the same header file names in
multiple include directories searched by the compiler. Also, this will
silence any regular compiler warnings from header files found under these
include directories. This constitutes a **break in backward
compatibility** that will break some customer CMake project builds that
use Trilinos on fragile environments where the search order of the include
directories is important.
There are several different approaches for addressing this change from
`-I` to `-isystem` for the Trilinos include directories described below.
**Approach-1:** Update to CMake 3.23 and set the Trilinos configure
variable:
-D Trilinos_IMPORTED_NO_SYSTEM=ON
This will change the listing of Trilinos include directories in
downstream customer CMake projects from `-isystem` back to `-I` and will
therefore restore nearly perfect backward compatibility.
**Approach-2:** Set the cache variable
`CMAKE_NO_SYSTEM_FROM_IMPORTED=TRUE` in a downstream CMake project,
which will restore the include directories for the IMPORTED library
targets for the TriBITS project as non-SYSTEM include directories
(i.e. `-I`) but it will also cause all include directories for all
IMPORTED library targets to be non-SYSTEM (i.e. `-I`) even if they were
being handled as SYSTEM include directories before. Therefore, that could
still break the downstream project as it might change what header files
are found for these other IMPORTED library targets and may expose many new
warnings (which may have been silenced by their include directories being
pulled in using `-isystem`).
**Approach-3:** Clean up the list of include directories that is searched
by the compiler so that only the correct header files can be found
(regardless of the search order).
**Approach-4:** Delete some header files in the set of searched include
directories so that only the correct header files can be found (regardless
of the search order).
**Approach-5:** Use other approaches more specific to the given customer
project. For example, if multiple different versions of the googletest
(gtest) library header files can be found in the list of include
directories, then a customer project can build its own version of
googletest and put its include directories first on the compile line and
list them with `-I` to ensure that only that version will be found, no
matter how many other versions of gtest are installed on the system. For
an example of how this was done for one customer CMake project, see
https://github.com/trilinos/Trilinos/issues/8001#issuecomment-1032827124.
For more details on this change in the handling of include directories
with the switch to modern CMake, see
https://github.com/TriBITSPub/TriBITS/issues/443.
###############################################################################
# #
# Trilinos Release 12.12 Release Notes #
# #
###############################################################################
Overview:
The Trilinos Project is an effort to develop algorithms and enabling
technologies within an object-oriented software framework for the solution of
large-scale, complex multi-physics engineering and scientific problems.
Packages:
The Trilinos 12.12 general release contains 58 packages: Amesos, Amesos2,
Anasazi, AztecOO, Belos, CTrilinos, Didasko, Domi, Epetra, EpetraExt, FEI,
ForTrilinos, Galeri, GlobiPack, Ifpack, Ifpack2, Intrepid, Isorropia, Kokkos,
Komplex, LOCA, Mesquite, ML, Moertel, MOOCHO, MueLu, NOX, Optika, OptiPack,
Pamgen, Phalanx, Pike, Piro, Pliris, PyTrilinos, ROL, RTOp, Rythmos, Sacado,
SEACAS, Shards, ShyLU, STK, Stokhos, Stratimikos, Sundance, Teko, Tempus*,
Teuchos, ThreadPool, Thyra, Tpetra, TriKota, TrilinosCouplings, Trios,
Triutils, Xpetra, Zoltan, Zoltan2.
(* denotes package is being released externally as a part of Trilinos for the
first time.)
Domi
- Enhancements
- Added more sophisticated processor decomposition algorithm. If the
decomposition of the processors along each axis is specified
incompletely for two or more axes, Domi now returns a much more
logical decomposition
- Bug fixes
- Fixed memory management error related to Tuple and ArrayView
- Updated ParameterList documentation so that it displays properly
in HTML documentation
- Fixed bug in MDMap::getAugmentedMDMap() method
Ifpack2
- Multithreaded Gauss-Seidel now builds by default (#288)
Ifpack2's support for multithreaded Gauss-Seidel uses a new
thread-parallel graph coloring algorithm implemented in KokkosKernels.
(KokkosKernels currently lives in tpetra/kernels.) Ifpack2 makes this
algorithm available in Ifpack2::Relaxation. In the 12.10 release,
users had to set CMake configuration options to nondefault values in
order to enable this code. Now, this code builds and is available by
default. (This fixes GitHub Issue #288.) It is still possible to
disable building this code, by setting the following CMake option to
OFF:
- Ifpack2_ENABLE_Experimental_KokkosKernels_Features
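The idea behind coloring-based Gauss-Seidel, as described above, can be
sketched in plain C++. This is only a serial illustration under stated
assumptions: the `CsrMatrix` struct and the functions `greedyColor` and
`multicolorGaussSeidelSweep` are hypothetical names, the matrix is assumed
to have a symmetric sparsity pattern and nonzero diagonal, and a simple
greedy coloring stands in for the thread-parallel coloring algorithm in
KokkosKernels. Rows that share a color are not coupled to each other, so
the inner loop over one color is the part that could run in parallel.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical compressed sparse row storage (not a Trilinos class).
struct CsrMatrix {
  std::size_t numRows;
  std::vector<std::size_t> rowPtr;  // size numRows + 1
  std::vector<std::size_t> colInd;  // column index of each stored entry
  std::vector<double> values;       // value of each stored entry
};

// Greedy distance-1 coloring: each row takes the smallest color that no
// already-colored neighbor uses. Assumes a symmetric sparsity pattern.
std::vector<int> greedyColor(const CsrMatrix& A) {
  std::vector<int> color(A.numRows, -1);
  for (std::size_t i = 0; i < A.numRows; ++i) {
    const std::size_t rowLen = A.rowPtr[i + 1] - A.rowPtr[i];
    std::vector<bool> used(rowLen + 1, false);  // one slot is always free
    for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
      const std::size_t j = A.colInd[k];
      if (j == i || color[j] < 0) continue;
      const std::size_t c = static_cast<std::size_t>(color[j]);
      if (c < used.size()) used[c] = true;
    }
    std::size_t c = 0;
    while (used[c]) ++c;
    color[i] = static_cast<int>(c);
  }
  return color;
}

// One Gauss-Seidel sweep, visiting rows color by color. Rows of the same
// color do not read each other's entries of x, so the loop over rows of
// one color could be executed by many threads at once.
void multicolorGaussSeidelSweep(const CsrMatrix& A,
                                const std::vector<int>& color,
                                const std::vector<double>& b,
                                std::vector<double>& x) {
  const int numColors =
      color.empty() ? 0 : *std::max_element(color.begin(), color.end()) + 1;
  for (int c = 0; c < numColors; ++c) {
    for (std::size_t i = 0; i < A.numRows; ++i) {
      if (color[i] != c) continue;
      double diag = 0.0;
      double sum = b[i];
      for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
        const std::size_t j = A.colInd[k];
        if (j == i) diag = A.values[k];
        else sum -= A.values[k] * x[j];
      }
      assert(std::fabs(diag) > 0.0);  // sketch requires a nonzero diagonal
      x[i] = sum / diag;
    }
  }
}
```

For a tridiagonal matrix the greedy coloring reduces to the familiar
red-black ordering, which is a convenient case to check by hand.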
Isorropia
- Removed experimental Tpetra interface. The macro that would have
enabled it was commented out, so it could never have built. See
discussion in #1406: https://github.com/trilinos/Trilinos/issues/1406
PyTrilinos
- General
- PyTrilinos now works with both Python versions 2 and 3
- Internally, PyTrilinos now uses relative imports
- Protect against Doxygen version 1.8.13
- Teuchos
- Fix ParameterList __cmp__() operator
- Improved memory management for wrapped version of sublists
- Fixed a memory leak in the definitions for certain directorin
typemaps
- Epetra
- Add AsMap() method to Epetra.BlockMap class
- ML
- Fixed a dangling reference error in an ML example script.
- LOCA
- Fixed a memory leak in LOCA example script
- Fixed a memory leak in the wrappers for the
LOCA::Abstract::Iterator::StepStatus enumeration
- Anasazi
- Fixed a memory leak that PyTrilinos introduced with the
Eigensolution<...>::evecs() and espace() methods
Tempus
Tempus provides a general infrastructure for the time evolution
of solutions to ODEs, PDEs, and DAEs, through a variety of general
integration schemes, and can be used from small systems of
equations (e.g., single ODEs for the time evolution of plasticity
models) to large-scale transient simulations requiring exascale
computing (e.g., flow fields around reentry vehicles and
magneto-hydrodynamics).
- Examples of time-integration methods available are:
- Tempus::StepperForwardEuler "Forward Euler"
- Tempus::StepperBackwardEuler "Backward Euler"
- Tempus::StepperExplicitRK "Explicit Runge-Kutta"
- Tempus::StepperDIRK "Diagonally Implicit Runge-Kutta methods"
- Newmark-β
- Tempus::StepperNewmarkExplicitAForm "Explicit A-form"
- Tempus::StepperNewmarkImplicitAForm "Implicit A-form"
- Tempus::StepperNewmarkImplicitDForm "Implicit D-form"
- Tempus::StepperHHTAlpha "Hilber-Hughes-Taylor (HHT-α)"
- Tempus::StepperIMEX_RK "Implicit/Explicit Runge-Kutta (IMEX-RK) methods"
- Tempus::StepperIMEX_RK_Partition "Partitioned IMEX-RK methods"
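To make the first two schemes in the list above concrete, here is a
minimal sketch of forward and backward Euler applied to the scalar test
problem dy/dt = -lambda * y. It does not use the Tempus API; the function
names `forwardEulerDecay` and `backwardEulerDecay` are hypothetical, and
the backward Euler update relies on the problem being linear so that the
implicit equation can be solved in closed form.

```cpp
#include <cassert>
#include <cmath>

// Test problem: dy/dt = -lambda * y, y(0) = y0.
// Exact solution: y(t) = y0 * exp(-lambda * t).

// Forward (explicit) Euler: y_{n+1} = y_n + dt * f(y_n).
double forwardEulerDecay(double y0, double lambda, double dt, int numSteps) {
  assert(dt > 0.0 && numSteps >= 0);
  double y = y0;
  for (int n = 0; n < numSteps; ++n) {
    y += dt * (-lambda * y);
  }
  return y;
}

// Backward (implicit) Euler: y_{n+1} = y_n + dt * f(y_{n+1}).
// For this linear right-hand side the implicit equation rearranges to
// y_{n+1} = y_n / (1 + dt * lambda), so no nonlinear solve is needed.
double backwardEulerDecay(double y0, double lambda, double dt, int numSteps) {
  assert(dt > 0.0 && numSteps >= 0);
  double y = y0;
  for (int n = 0; n < numSteps; ++n) {
    y = y / (1.0 + dt * lambda);
  }
  return y;
}
```

With a small step both schemes approach the exact solution. With a large
step on a stiff problem (for example lambda = 100 and dt = 0.1) the
forward Euler iterate grows in magnitude while the backward Euler iterate
still decays, which is the usual motivation for implicit steppers.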
###############################################################################
# #
# Trilinos Release 12.10 Release Notes #
# #
###############################################################################
Overview:
The Trilinos Project is an effort to develop algorithms and enabling
technologies within an object-oriented software framework for the solution of
large-scale, complex multi-physics engineering and scientific problems.
Packages:
The Trilinos 12.10 general release contains 58 packages: Amesos, Amesos2,
Anasazi, AztecOO, Belos, CTrilinos, Didasko, Domi, Epetra, EpetraExt, FEI,
ForTrilinos, Galeri, GlobiPack, Ifpack, Ifpack2, Intrepid, Isorropia, Kokkos,
Komplex, LOCA, Mesquite, ML, Moertel, MOOCHO, MueLu, NOX, Optika, OptiPack,
Pamgen, Phalanx, Pike, Piro, Pliris, PyTrilinos, ROL, RTOp, Rythmos, Sacado,
SEACAS, Shards, ShyLU, STK, Stokhos, Stratimikos, Sundance, Teko, Teuchos,
ThreadPool, Thyra, Tpetra, TriKota, TrilinosCouplings, Trios, Triutils, Xpetra,
Zoltan, Zoltan2.
(* denotes package is being released externally as a part of Trilinos for the
first time.)
Ifpack2
- Experimental multithreaded Gauss-Seidel
This uses a new thread-parallel graph coloring algorithm implemented
in KokkosKernels. Ifpack2's support for this lives in
Ifpack2::Relaxation. Currently, building it is disabled by default
(see #288). Enabling it requires enabling the following CMake options
that are currently OFF by default:
- TpetraKernels_ENABLE_Experimental
- Ifpack2_ENABLE_Experimental
- Ifpack2_ENABLE_Experimental_KokkosKernels_Features
- Ifpack2::Krylov is really SUPER deprecated now
It's been deprecated for a LONG time, complete with deprecated
warnings. This didn't suffice for warning some users, so I moved the
class to the new Ifpack2::DeprecatedAndMayDisappearAtAnyTime
namespace.
- Encapsulate local sparse triangular solves in a new class,
LocalSparseTriangularSolver
- Fix various issues
Issues fixed include (but are not limited to) #672, #570, #567, #558,
#551, #544, #409, #234, and #64.
Tpetra
- Build time and size improvements (fix #700)
KokkosKernels now only pre-builds the sparse matrix-vector multiply
kernels that Tpetra needs. Also, for integer Scalar types,
KokkosKernels no longer optimizes sparse matrix-vector multiply for
multiple right-hand sides. It does so only for non-integer (e.g.,
floating-point) Scalar types. This reduces build time and size. (See
GitHub Issue #700.) Furthermore, KokkosKernels now only pre-builds
sparse matrix-vector multiply for the default offset type.
- Removed "using Teuchos::*" declarations from Tpetra_ConfigDefs.hpp
Tpetra no longer imports Teuchos classes like Comm and RCP (among
others) into the Tpetra namespace. This will help us eventually
remove all the Teuchos_*.hpp header file includes from
Tpetra_ConfigDefs.hpp, thus improving build time.
- MultiVector: Add new two-argument randomize(min,max)
- MultiVector: Get rid of old-interface DistObject methods
Tpetra::MultiVector implements the new DistObject interface. Thus, it
no longer needs to provide implementations for the following three
old-interface DistObject methods:
- createViews
- createViewsNonConst
- releaseViews
- Optimize Map::replaceCommWithSubset for MPI_COMM_SELF (#673)
- Fixed many other issues
Issues fixed include (but are not limited to) #699, #680, #638, #617,
#607, #603, #601, #597, #561, and #46.
###############################################################################
# #
# Trilinos Release 12.8 Release Notes #
# #
###############################################################################
Overview:
The Trilinos Project is an effort to develop algorithms and enabling
technologies within an object-oriented software framework for the solution of
large-scale, complex multi-physics engineering and scientific problems.
Packages:
The Trilinos 12.8 general release contains 58 packages: Amesos, Amesos2,
Anasazi, AztecOO, Belos, CTrilinos, Didasko, Domi, Epetra, EpetraExt, FEI,
ForTrilinos, Galeri, GlobiPack, Ifpack, Ifpack2, Intrepid, Isorropia, Kokkos,
Komplex, LOCA, Mesquite, ML, Moertel, MOOCHO, MueLu, NOX, Optika, OptiPack,
Pamgen, Phalanx, Pike, Piro, Pliris, PyTrilinos, ROL, RTOp, Rythmos, Sacado,
SEACAS, Shards, ShyLU, STK, Stokhos, Stratimikos, Sundance, Teko, Teuchos,
ThreadPool, Thyra, Tpetra, TriKota, TrilinosCouplings, Trios, Triutils, Xpetra,
Zoltan, Zoltan2.
(* denotes package is being released externally as a part of Trilinos for the
first time.)
Domi
- Added replicated boundaries
- A replicated boundary exists only on a periodic domain, and is
simply a convention that the end points are the same points. For
example, a left end coordinate that represents 0 degrees and a
right end coordinate that represents 360 degrees. Domi now
supports either convention, and it affects communication.
- Added additional tests for periodic domains
- Enhancements
- New MDVector constructor that takes a parent MDVector and an array
of Slices
- MDMap support for axis maps
- MDMap getMDComm() method
PyTrilinos
- General
- Improved formatting in example scripts
- Domi
- Update MDMap constructor for replicated boundaries
- Fixed ETI bugs
- NOX/LOCA
- Fixed memory leak by updating NOX typemaps
- Tpetra
- Fix difficult-to-wrap Map class by using %inline
Tpetra
- Stop creating Node instances explicitly!
Hi users! Please don't create Node instances explicitly any more.
Tpetra::Map creates one for you, if you really need one. You really
don't need Node instances: Map's constructors and nonmember
"constructors" don't need them any more, nor do Tpetra's Matrix Market
readers.
Creating Node instances explicitly causes issues with Kokkos
initialization. Node will go away eventually, in favor of Kokkos
execution spaces and memory spaces.
- Lots of bug fixes, especially for CUDA
- Computing offsets in CrsGraph and CrsMatrix is now thread parallel
CrsGraph's and CrsMatrix's fillComplete method computes row offsets,
if they have not yet been computed. This is now thread parallel. It
uses Kokkos::parallel_scan.
- More BlockCrsMatrix kernels are thread parallel
- Interface changes to KokkosSparse::CrsMatrix (the "local" matrix)
- The replaceValues and sumIntoValues methods now take "is_sorted" and
"force_atomic" arguments. These methods now use binary search
(falling back to linear search for short rows) for the sorted case.
- Row views in KokkosSparse::CrsMatrix are no longer templated. They
now use the ordinal type, rather than the offset type, for indexing.
This suffices as long as there are not enough duplicate entries in a
row to exceed ordinal_type. This has the beneficial side effect of
reducing the number of local sparse matrix-vector multiply kernel
instantiations.
- Got rid of LittleBlock and LittleVector (for Block* classes)
Instead, use the little_block_type, const_little_block_type,
little_vec_type, and const_little_vec_type typedefs in BlockCrsMatrix
and other related classes. Underlying data layout has NOT changed
(yet), but constructors HAVE changed. This is technically a
non-backwards-compatible interface change, but all these classes are
in an Experimental namespace anyway.
- Got rid of KokkosClassic::DefaultArithmetic
Stokhos was using this, so we had left it in place in previous
releases for backwards compatibility. Now that no other packages
depend on it, we have gotten rid of it for good. Its functionality
has been replaced by various functions in TpetraKernels.
The original idea behind DefaultArithmetic, as suggested in the name,
was that users could swap out this "default" implementation of
multivector operations with their own implementations. This is
generally less useful than swapping out the implementation of sparse
matrix kernels (like sparse matrix-vector multiply or sparse
triangular solve). As a result, Tpetra never had an implementation
(since at least January 2010) of multivector operations other than
DefaultArithmetic.
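As a note on the row-offset computation mentioned above for fillComplete:
row offsets are the exclusive prefix sum (scan) of the number of entries
in each row. The sketch below shows that relationship with a serial loop.
It is an illustration only; `computeRowOffsets` is a hypothetical name,
and the serial loop stands in for Kokkos::parallel_scan, which computes
the same result in thread-parallel fashion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Given the number of entries in each row, return the row offsets array
// of length numRows + 1. offsets[i] is where row i starts in the packed
// column-index and value arrays; offsets[numRows] is the total number of
// stored entries.
std::vector<std::size_t> computeRowOffsets(
    const std::vector<std::size_t>& numEntriesPerRow) {
  std::vector<std::size_t> offsets(numEntriesPerRow.size() + 1, 0);
  for (std::size_t i = 0; i < numEntriesPerRow.size(); ++i) {
    // Exclusive scan: each offset is the sum of all earlier row lengths.
    offsets[i + 1] = offsets[i] + numEntriesPerRow[i];
  }
  assert(offsets.front() == 0);
  return offsets;
}
```

For example, rows with 2, 0, 3, and 1 entries yield the offsets
0, 2, 2, 5, 6.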
ROL
- NEW FEATURES
- Methods
- New phi-divergence capabilities for distributionally-robust
optimization.
- NonlinearLeastSquaresObjective functionality enables the solution of
nonlinear equations through the EqualityConstraint object.
- Infrastructure
- Composite bound constraint (ROL_BoundConstraint_Partitioned).
- Composite equality constraint (ROL_EqualityConstraint_Partitioned)
- Merit function for interior point methods.
- Adapter for Teuchos::SerialDenseVector.
- L1, Lp, Linf norms for interior point methods.
- Allow user-defined bracketing objects.
- Line searches can take user-defined scalar minimizers.
- Ability to supply ScalarMinimizationLineSearch with custom
ScalarFunction.
- New application development and interface tools for PDE-constrained
optimization in PDE-OPT.
- New PDE-OPT examples: stochastic Stefan-Boltzmann, stochastic
advection-diffusion, etc.
- Adaptive sparse grid capabilities with TriKota.
Zoltan
- Improved robustness of RCB partitioner for problems where many objects have
weight = 0 (e.g., PIC codes). Convergence is faster and the stopping
criteria are more robust.
- Fixed bug that occurred when RETURN_LIST=PARTS and (Num_GID > 1 or
Num_LID > 1); GIDs and LIDs are now copied correctly into return lists.
- Fixed a bug related to struct padding in the siMPI serial MPI interface.
Zoltan2
- Graph/Matrix ordering
- Scotch now can be used for graph/matrix ordering.
- The ordering interface Zoltan2::OrderingSolution has been updated
to allow users to access separator info, if it is available.
- Zoltan2::OrderingSolution method getPermutation() is now
getPermutationView().
- Partitioning Metrics
- Partitioning metrics have been moved out of the PartitioningProblem.
They are now accessed through a separate class:
Zoltan2::EvaluatePartition.
- EvaluatePartition accepts as input a
Zoltan2::Adapter and, optionally, a Zoltan2::PartitioningSolution.
Thus, it can be used before or after partitioning, and before or
after migration.
- Imbalance and graph metrics are available.
- Task placement
- A new PartitionMapping class maps parts to processors.
- The MachineRepresentation has been updated, and specializations using
Cray RCA and IBM TopoMgr are provided.
- Geometric task placement using Multijagged partitioning better handles
cases where the machine's network dimension is greater than the
dimension of the coordinates.
- Multijagged partitioning
- Zoltan2's Multijagged partitioner can now partition wrt the longest
coordinate dimension, or in specified x-y-z order.
- TPLs
- Conversions between the index types in TPLs (ParMETIS, Scotch, Zoltan)
are handled more robustly through the TPL_Traits class.
- Interfaces to ParMETIS' AdaptiveRepart and RefineKway algorithms were
added.
- Bugs in the Zoltan interface are fixed.
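To illustrate the kind of imbalance metric mentioned under Partitioning
Metrics above, the sketch below computes the ratio of the heaviest part's
weight to the average part weight, so that 1.0 means perfect balance.
This is a common definition and is assumed here for illustration; it is
not taken from the Zoltan2 source, and `partImbalance` is a hypothetical
name rather than part of Zoltan2::EvaluatePartition. Because it only needs
a part assignment and weights, the same function can be applied to an
initial distribution or to the result of partitioning.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// partOfObject[i] is the part that object i is assigned to (0-based),
// weight[i] is the weight of object i.
// Returns (maximum part weight) / (average part weight).
double partImbalance(const std::vector<int>& partOfObject,
                     const std::vector<double>& weight,
                     int numParts) {
  assert(numParts > 0);
  assert(partOfObject.size() == weight.size());
  std::vector<double> partWeight(static_cast<std::size_t>(numParts), 0.0);
  for (std::size_t i = 0; i < partOfObject.size(); ++i) {
    const int p = partOfObject[i];
    assert(p >= 0 && p < numParts);
    partWeight[static_cast<std::size_t>(p)] += weight[i];
  }
  const double total =
      std::accumulate(partWeight.begin(), partWeight.end(), 0.0);
  assert(total > 0.0);  // the ratio is undefined if all weights are zero
  const double average = total / numParts;
  return *std::max_element(partWeight.begin(), partWeight.end()) / average;
}
```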
###############################################################################
# #
# Trilinos Release 12.6 Release Notes #
# #
###############################################################################
Overview:
The Trilinos Project is an effort to develop algorithms and enabling
technologies within an object-oriented software framework for the solution of
large-scale, complex multi-physics engineering and scientific problems.
Packages:
The Trilinos 12.6 general release contains 58 packages: Amesos, Amesos2,
Anasazi, AztecOO, Belos, CTrilinos, Didasko, Domi, Epetra, EpetraExt, FEI,
ForTrilinos, Galeri, GlobiPack, Ifpack, Ifpack2, Intrepid, Isorropia, Kokkos,
Komplex, LOCA, Mesquite, ML, Moertel, MOOCHO, MueLu, NOX, Optika, OptiPack,
Pamgen, Phalanx, Pike, Piro, Pliris, PyTrilinos, ROL, RTOp, Rythmos, Sacado,
SEACAS, Shards, ShyLU, STK, Stokhos, Stratimikos, Sundance, Teko, Teuchos,
ThreadPool, Thyra, Tpetra, TriKota, TrilinosCouplings, Trios, Triutils, Xpetra,
Zoltan, Zoltan2.
(* denotes package is being released externally as a part of Trilinos for the
first time.)
Ifpack2
- Fixed Ifpack2's part of Bug 6358
Ifpack2, as well as its tests and examples, no longer require
GlobalOrdinal = int to be enabled in Tpetra.
- Fixed Bug 6443
One unit test for RILUK had an excessively tight tolerance for Scalar
= float. This commit relaxes the tolerance in a principled way. As a
result, the test now passes.
ROL
- Enhancements:
- Default template parameters for ROL::TpetraMultiVector.
- StdVector, TpetraMultiVector, and PartitionedVector now do
dimensional compatibility checks.
- More unary and binary elementwise functions added.
- New Features:
- Methods:
- InteriorPointStep class, and classes it uses (e.g. PenalizedObjective,
InequalityConstraint, and CompositeConstraint) to solve Interior Point
problems with the CompositeStep SQP solver with successive penalty
reduction.
- Standalone GMRES solver in the Krylov directory.
- Mixed quantile risk measure.
- New risk measure corresponding to Kullback-Leibler based
distributionally robust optimization.
- SROM SampleGenerator: The associated samples are determined
by solving an optimization problem. Current objective functions
correspond to moment matching and the squared L2 error between
distribution functions.
- The PDE-OPT Application Development Kit (ADK) enables the rapid
prototyping of large-scale risk-averse optimization problems with
PDE constraints. The PDE-OPT ADK comprises three key modules:
-- degree-of-freedom manager, which enables the use of an arbitrary
number of simulation, control, and design fields based on finite
element discretizations, on 1D, 2D and 3D meshes;
-- finite element assembly loops and data structures that enable the
development of a variety of multiphysics components, built on
Intrepid for local finite element computations and Tpetra for
parallel linear algebra data structures;
-- interface between the physics module and the SimOpt programming
interface.
- Infrastructure:
- OptimizationProblem class unifies Algorithm::run interface.
- StochasticProblem allows for the construction of a general
stochastic objective function based on input parameters.
- Generalized CVaRVector to RiskVector. This allows for a very general
treatment of risk-averse optimization problems.
- Risk measure factory.
- Default solve implementation for EqualityConstraint_SimOpt.
- BoundConstraint capability for Interior Point problems with and
without Equality Constraints.
Tpetra
- Better CUDA testing
We added more nightly test builds with CUDA enabled (for running on
NVIDIA GPUs). The builds test various combinations of CUDA with
different compiler versions and host thread parallelism options
(OpenMP, Pthreads, serial). CUDA + GCC 4.7.2 is currently the
best-tested option, but we're using these tests to improve support for
other options.
- CrsMatrix, MultiVector, Vector: Added 'atomic' option to sumInto
The sumIntoLocalValues method in Tpetra::CrsMatrix, and the
sumIntoLocalValue and sumIntoGlobalValue methods in
Tpetra::MultiVector and Tpetra::Vector, now take an optional bool
'atomic' argument. If true, the methods use Kokkos::atomic_add
(atomic +=); if false, they use (non-atomic) += as before.
This lets different threads call the methods concurrently on the same
entry/ies of the matrix, multivector, or vector. To support this, I
also modified CrsMatrix::sumIntoGlobalValues so that it does not
change Teuchos::RCP reference counts, thus making it thread safe.
The default value of 'atomic' depends on the class' execution space.
If the execution space is Kokkos::Serial (no threads), atomic is false
by default; else, it is true by default. This ensures that existing
MPI-only codes do not need to pay the (small integer factor) overhead
of atomic updates, while making sumInto always correct by default when
using thread parallelism.
If you know that different threads will never access the same entries
concurrently, you should set atomic=false for best performance.
- Block(Multi)Vector: Add "offset view" constructors (Bug 6450)
Tpetra::Experimental::{BlockMultiVector, BlockVector} now have two
"offset view" constructors. They behave analogously to the offsetView
and offsetViewNonConst methods of Tpetra::MultiVector.
The constructors view an existing BlockMultiVector with a different
mesh Map, and an optional local row offset from which to start the
view on each process. The offset is a mesh offset (it gets multiplied
by the block size internally in order to find the point offset). The
two constructors differ only in that one lets you supply the new point
Map, while the other computes it for you.
This fixes Bug 6450 (which was a feature request).
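The 'atomic' option on the sumInto methods described above can be
pictured with the following standard C++ sketch. It is not Tpetra or
Kokkos code: `SketchVector` and its members are hypothetical, std::atomic
with a compare-and-swap loop stands in for Kokkos::atomic_add, and a
constructor flag stands in for the check on the execution space. The
sketch only shows how the flag selects between the two update paths and
how its default is chosen; it does not launch any threads.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vector whose entries may be updated by several threads.
class SketchVector {
public:
  // serialExecutionSpace mimics "the execution space is Kokkos::Serial":
  // in that case atomic updates are off by default, otherwise on.
  SketchVector(std::size_t n, bool serialExecutionSpace)
      : values_(n), atomicByDefault_(!serialExecutionSpace) {
    for (auto& v : values_) v.store(0.0);
  }

  bool atomicByDefault() const { return atomicByDefault_; }
  double get(std::size_t i) const { return values_[i].load(); }

  // Uses the default that depends on the (mimicked) execution space.
  void sumIntoLocalValue(std::size_t i, double delta) {
    sumIntoLocalValue(i, delta, atomicByDefault_);
  }

  // Explicit choice, analogous to passing the optional 'atomic' argument.
  void sumIntoLocalValue(std::size_t i, double delta, bool atomic) {
    assert(i < values_.size());
    if (atomic) {
      // Atomic +=, written as a compare-and-swap loop so that it does not
      // depend on C++20 floating-point fetch_add.
      double expected = values_[i].load(std::memory_order_relaxed);
      while (!values_[i].compare_exchange_weak(
          expected, expected + delta, std::memory_order_relaxed)) {
        // expected now holds the current value; retry with it.
      }
    } else {
      // Separate read and write: correct only if no other thread updates
      // entry i at the same time, but cheaper than the atomic path.
      values_[i].store(values_[i].load(std::memory_order_relaxed) + delta,
                       std::memory_order_relaxed);
    }
  }

private:
  std::vector<std::atomic<double>> values_;
  bool atomicByDefault_;
};
```

As in the note above, a caller that knows no two threads touch the same
entry would pass false explicitly to avoid the cost of the atomic path.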
Zoltan
- Minor code cleanup and bug fixes.
- New Zoltan_Get_Fn interface returning pointers to callback functions.
See zoltan/src/include/zoltan.h for details.
- Closest stand-alone Zoltan release is v3.83.
http://www.cs.sandia.gov/Zoltan
Zoltan2
- New interface to graph partitioning third-party library PuLP (Partitioning
using Label Propagation). PuLP is currently single-node, multi-threaded
with OpenMP. Work for the next release will include extension to
MPI+OpenMP.
- Improved handling of TPL data types, especially in the Zoltan interface.
- Interface from MatrixAdapter to Zoltan hypergraph algorithms implemented.
- Consistent handling of Tpetra explicitly instantiated types; in particular,
enabled builds without GlobalOrdinal=int and without Epetra.
- PartitioningSolutionQuality class is being refactored and renamed
EvaluatePartition; work will continue to next release.
###############################################################################
# #
# Trilinos Release 12.4 Release Notes #
# #
###############################################################################
Overview:
The Trilinos Project is an effort to develop algorithms and enabling
technologies within an object-oriented software framework for the solution of
large-scale, complex multi-physics engineering and scientific problems.
Packages:
The Trilinos 12.4 general release contains 58 packages: Amesos, Amesos2,
Anasazi, AztecOO, Belos, CTrilinos, Didasko, Domi, Epetra, EpetraExt, FEI,
ForTrilinos, Galeri, GlobiPack, Ifpack, Ifpack2, Intrepid, Isorropia, Kokkos,
Komplex, LOCA, Mesquite, ML, Moertel, MOOCHO, MueLu, NOX, Optika, OptiPack,
Pamgen, Phalanx, Pike, Piro, Pliris, PyTrilinos, ROL, RTOp, Rythmos, Sacado,
SEACAS, Shards, ShyLU, STK, Stokhos, Stratimikos, Sundance, Teko, Teuchos,
ThreadPool, Thyra, Tpetra, TriKota, TrilinosCouplings, Trios, Triutils, Xpetra,
Zoltan, Zoltan2.
(* denotes package is being released externally as a part of Trilinos for the
first time.)
Amesos2
- Added MUMPS 5.0 support
- KLU2 is enabled by default
- Bug fixes
- SuperLU_DIST multiple version support (up to 4.0)
Domi
- Various bug fixes:
- Fixed a bug where a periodic axis under a serial build would not
update communication padding. The parallel version uses MPI
capabilities to do this, so a serial-only capability had to be
added for this corner case.
- Fixed a test that was doing a dynamic type comparison to be more
robust.
- Fixed some minor bugs in the MDVector getTpetraMultiVector()
method.
- The MDArrayView size() method was made const, as it should be
- Created serial tests
- Domi does not provide much in the way of unique capabilities if it
is compiled in serial. Nevertheless, all tests that run on 1
processor were updated to run under a serial build of Trilinos as
well.
Ifpack2
- Deprecated Ifpack2::Krylov
If you want a Krylov solver, just use Belos. Ifpack2::Factory now
throws with an informative exception message if you ask for "KRYLOV".
MueLu
- MueLu requires certain C++11 features. Minimum compiler versions are
gcc 4.7.2 or icc 13. (See https://trilinos.org/about/cxx11.)
- Kokkos is a required dependency.
- New MatrixAnalysis Factory.
- Deprecated Create[TE]petraPreconditioner interfaces taking
Tpetra::CrsMatrix types as input argument in favor of interfaces
accepting Tpetra::Operator types.
- Numerous improvements to MueMex (MueLu's interface to Matlab).
- MUMPS can now be used as a coarse grid solver through Amesos2.
- Unification of MueLu Epetra and Tpetra interfaces through Stratimikos.
PyTrilinos
- General
- Fixed several memory management bugs
- Fixed a pervasive bug where C++ exceptions were caught in the
wrong order. This fix results in Python errors that have much
better messages, making it easier to debug problems that employ
cross-language polymorphism.
- Epetra
- Added new Python-to-C++ converters that convert a wider variety of
Python objects to distributed Epetra vectors (Epetra_MultiVector,
Epetra_Vector, Epetra_IntVector). This includes any Python object
that exports the Distributed Array Protocol, such as Enthought's
DistArray. It also includes NumPy arrays when running in a serial
environment.
- Added __distarray__() method to Epetra vector classes, so that
they now export the Distributed Array Protocol.
- Memory management issues associated with the Epetra.LinearProblem
class and the new Epetra vector converters were fixed.
- Tpetra
- Fixed wrappers for Tpetra MultiVector and Vector classes.
- Added __distarray__() method to Tpetra vector classes, so that
they now export the Distributed Array Protocol.
- Bug fixes related to latest Tpetra upgrade
- LOCA
- Introduced a new LOCA test in which the Chan problem is solved
using a preconditioner.
ROL
- Enhancements
- Hierarchical XML parameter lists. This makes ROL easier to use and
control. Demonstrated in all examples and tests. Also created
the tierParameterList function to generate a hierarchical list
from a flat list, in rol/src/zoo/ROL_ParameterListConverters.hpp,
demonstrated in rol/test/parameters.
- Algorithm constructor now takes reference-counted pointers to
Step and StatusTest. There is another constructor that takes a step
name (string) and a parameter list. This makes it easier to initialize
a ROL algorithm, based on default choices of steps and status tests.
- New elementwise functions in ROL::Vector allow application of general
nonlinear unary and binary functions as well as reduce operations.
- Modified ROL::BoundConstraint to work with any vector type for which
Vector::applyUnary, Vector::applyBinary, and Vector::reduce are
implemented.
- Modified default behavior of line search so that when the maximum
number of function evaluations is reached and sufficient decrease has
not been attained, optimization terminates. The previous behavior can
be recovered by setting the parameter "Accept Last Alpha" to true in
the Step->Line Search sublist.
- Added line search parameter "Accept Linesearch Minimizer" to the
Step->Line Search sublist. If this parameter is set to true,
the argmin step length will be used if the maximum number of function
evaluations is reached without attaining sufficient decrease.
- Renamed CompositeStepSQP to CompositeStep.
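The line search changes above can be pictured with a toy backtracking
search. This is only a sketch under stated assumptions, not ROL code: the
names Outcome, LineSearchResult, and backtrack are hypothetical, the two
bool arguments merely stand in for the "Accept Last Alpha" and "Accept
Linesearch Minimizer" parameters, and giving the minimizer flag priority
when both are set is an arbitrary choice made here.

```cpp
#include <cassert>
#include <functional>
#include <limits>

// Toy model only: these names are illustrative and are not ROL's API.
enum class Outcome { SufficientDecrease, AcceptedLastAlpha, AcceptedMinimizer, Terminated };

struct LineSearchResult {
  Outcome outcome;
  double alpha;  // step length to use (0 when the search gives up)
};

// phi(alpha) models f(x + alpha*s); dphi0 is the slope of phi at alpha = 0.
inline LineSearchResult backtrack(const std::function<double(double)>& phi,
                                  double dphi0, double c1, int maxEvals,
                                  bool acceptLastAlpha, bool acceptMinimizer) {
  assert(maxEvals >= 1 && dphi0 < 0.0 && c1 > 0.0 && c1 < 1.0);
  const double phi0 = phi(0.0);  // not counted against maxEvals in this sketch
  double alpha = 1.0;
  double bestAlpha = alpha;
  double bestPhi = std::numeric_limits<double>::infinity();
  for (int k = 0; k < maxEvals; ++k) {
    const double val = phi(alpha);
    // Remember the trial step length with the smallest function value.
    if (val < bestPhi) { bestPhi = val; bestAlpha = alpha; }
    // Armijo sufficient-decrease test.
    if (val <= phi0 + c1 * alpha * dphi0) return {Outcome::SufficientDecrease, alpha};
    if (k + 1 < maxEvals) alpha *= 0.5;  // halve the step and try again
  }
  // Evaluation budget exhausted without sufficient decrease.
  if (acceptMinimizer) return {Outcome::AcceptedMinimizer, bestAlpha};
  if (acceptLastAlpha) return {Outcome::AcceptedLastAlpha, alpha};
  return {Outcome::Terminated, 0.0};  // models the new default behavior
}
```

With both flags false the sketch reports Terminated once the budget runs
out, which mirrors the new default described above. Setting one of the
flags returns either the last trial step or the best trial step instead.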
- New Features
- Methods
- Bundle Step, for solving nonsmooth problems; see example/minimax/*.
- Moreau-Yosida Penalty, for solving general NLPs; see
example/burgers-control/example_04.
- Augmented Lagrangian, for solving general NLPs; see
example/burgers-control/example_04.
- Higher Moment Coherent Risk Measure. This method is a new risk measure
for stochastic problems; see example/burgers-control/example_06.
- Buffered Probability of Exceedance. This method is a new capability to
minimize the probability of a stochastic cost function. It is
demonstrated in example/burgers-control/example_06.
- Infrastructure
- In ROL_ScaledStdVector.hpp, added a variant of ROL::StdVector that
supports constant (positive) diagonal scalings in the dot product.
This variant comprises the pair of classes ROL::PrimalScaledStdVector
and ROL::DualScaledStdVector; changed the examples in
example/diode-circuit to use variable scalings through these new
classes.
- Distribution Factory, to enable general sampling for stochastic
problems; demonstrated in example/burgers-control/example_05 through
_07.
- SROMSampler. This method permits the use of optimization-based
sampling for stochastic problems. It is demonstrated in
test/sol/test_04.
- ROL::PartitionedVector, for handling vectors of vectors, e.g., when
using slack variables; see /rol/test/vector/test_04.cpp.
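To illustrate what a diagonally scaled dot product means for a
primal/dual pair of vector types, here is a minimal sketch on plain
std::vector. It does not use ROL. The helper names primalDot, dualDot,
and toDual are hypothetical, and the convention that the dual dot product
uses the inverse weights is an assumption of this sketch rather than
something stated in these notes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<double> Vec;

// Hypothetical helpers for illustration; these are not ROL classes.

// Primal dot product weighted by the positive diagonal entries w.
inline double primalDot(const Vec& x, const Vec& y, const Vec& w) {
  assert(x.size() == y.size() && x.size() == w.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) sum += w[i] * x[i] * y[i];
  return sum;
}

// Dual dot product, assumed here to use the inverse weights 1/w.
inline double dualDot(const Vec& x, const Vec& y, const Vec& w) {
  assert(x.size() == y.size() && x.size() == w.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) sum += x[i] * y[i] / w[i];
  return sum;
}

// Map a primal vector to its dual representation (entrywise multiply by w).
inline Vec toDual(const Vec& x, const Vec& w) {
  assert(x.size() == w.size());
  Vec d(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) d[i] = w[i] * x[i];
  return d;
}
```

Under these assumptions, primalDot(x, y, w) equals
dualDot(toDual(x, w), toDual(y, w), w), so the two scalings stay
consistent with each other.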
- Bug Fixes
- Removed reset of counters for objective function and gradient evaluations
contained in the AlgorithmState in rol/src/step/ROL_TrustRegionStep.hpp.
- Corrected reading of the constraint tolerance parameter in
ROL::AugmentedLagrangianStep.
Tpetra
- Changed CMake option for setting default Node type
To set the default Node type, use the Tpetra_DefaultNode CMake option.
We support the old KokkosClassic_DefaultNode CMake option for
backwards compatibility.
Tpetra will eventually change from using Node types to using
Kokkos::Device types directly. For now, though, if you wish to set
the default Node type explicitly, you must use one of the following:
- Kokkos::Compat::KokkosCudaWrapperNode (CUDA)
- Kokkos::Compat::KokkosOpenMPWrapperNode (OpenMP)
- Kokkos::Compat::KokkosSerialWrapperNode (Serial (no threads))
- Kokkos::Compat::KokkosThreadsWrapperNode (Pthreads)
Tpetra normally only enables one Node type, so you only need to set
the default Node type if you have enabled more than one Node type.
- Rules for which Node type gets enabled by default
Tpetra only enables one Node type by default, whether or not ETI
(explicit template instantiation) is enabled. Here are the rules for
which Node type gets enabled by default:
1. If you're building with CUDA, Tpetra uses CUDA by default.
2. Otherwise, if you're building with OpenMP, Tpetra uses OpenMP by
default.
3. Otherwise, if Kokkos enables the Serial execution space (if
Kokkos_ENABLE_Serial is ON), Tpetra uses Serial by default.
4. Otherwise, if Kokkos enables the Threads execution space (if
Kokkos_ENABLE_Pthread is ON), Tpetra uses Threads by default.
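The precedence in rules 1 through 4 can be restated as a small function.
This is only an illustration of the ordering, not Tpetra's CMake logic:
the BuildConfig struct and its field names are hypothetical, and the empty
string returned when nothing is enabled is a placeholder, since these
notes do not say what happens in that case.

```cpp
#include <string>

// Hypothetical description of a build; the field names are illustrative
// and are not CMake variables.
struct BuildConfig {
  bool cuda;           // building with CUDA
  bool openmp;         // building with OpenMP
  bool kokkosSerial;   // Kokkos_ENABLE_Serial is ON
  bool kokkosPthread;  // Kokkos_ENABLE_Pthread is ON
};

// Return the Node type that rules 1 through 4 would pick as the default.
inline std::string defaultNodeType(const BuildConfig& b) {
  if (b.cuda) return "Kokkos::Compat::KokkosCudaWrapperNode";            // rule 1
  if (b.openmp) return "Kokkos::Compat::KokkosOpenMPWrapperNode";        // rule 2
  if (b.kokkosSerial) return "Kokkos::Compat::KokkosSerialWrapperNode";  // rule 3
  if (b.kokkosPthread) return "Kokkos::Compat::KokkosThreadsWrapperNode";  // rule 4
  return "";  // case not covered by the rules above
}
```

For example, a build with OpenMP and the Serial execution space but
without CUDA would get the OpenMP Node type under this reading.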
If you wish to enable other Node types, you may set the following
CMake options. You do NOT need to set any of these options explicitly
if the Node type would be enabled by default anyway.
- Tpetra_INST_CUDA (Kokkos_ENABLE_Cuda must be ON, and Trilinos must
be built with CUDA; ON by default if building with CUDA)
- Tpetra_INST_OPENMP (Kokkos_ENABLE_OpenMP must be ON, and Trilinos
must be built with OpenMP support)
- Tpetra_INST_PTHREAD (Kokkos_ENABLE_Pthread must be ON)
- Tpetra_INST_SERIAL (Kokkos_ENABLE_Serial must be ON)
While it is legal to enable both the OpenMP and Pthreads back-ends in
the same executable, it is a bad idea. Both back-ends spawn their own
worker threads, and those threads will fight over cores.
- Completely removed the "classic" version of Tpetra
You might recall that a while back, we split Tpetra into "classic"
(old) and "Kokkos refactor" (new) versions. As of Trilinos 12.0, the
classic version was no longer supported, but we kept it in place for a
few users. As of this release, we have removed the classic version
completely.
You no longer need to set Tpetra_ENABLE_Kokkos_Refactor to get the new
version of Tpetra. It is ON (TRUE). If you attempt to set it to OFF
(FALSE), Tpetra's CMake raises an error at configure time. Just
enable Tpetra -- that's all you need to do!
This change affects both the Classic and Core subpackages of Tpetra.
All the "classic" Node types are gone now, along with their associated
computational kernels. Use the Kernels subpackage of Tpetra for local
kernels. (We left KokkosClassic::DefaultArithmetic in place for
Stokhos, but ONLY for Stokhos.) The "classic" versions of Tpetra
classes are also now gone. We have replaced them completely with
their "Kokkos refactor" versions.
You might have noticed that Doxygen had a hard time generating
documentation for the classes which had "classic" and "refactor"
versions. These changes should fix that. Furthermore, it's easier to
find header files for classes. In particular, most of the header
files in the tpetra/core/src/kokkos_refactor directory now just have
trivial definitions and only remain for backwards compatibility.
- Improved build times and fewer .cpp files in source directory
Tpetra does a better job now of splitting up explicit instantiations
into separate .cpp files. In some cases, it uses CMake to generate
those .cpp files automatically. This means fewer .cpp files in
tpetra/core/src, so it's easier to find what you want.
- 128-bit floating-point arithmetic through Scalar = __float128
__float128 is a GCC language extension to C(++) that implements
128-bit (IEEE 754 quadruple-precision) floating-point arithmetic in
software. It requires linking with libquadmath, which comes with GCC.
You must use GCC in order to try this feature. Also, set the
following CMake variables:
Tpetra_INST_FLOAT128:BOOL=ON
CMAKE_CXX_FLAGS:STRING="-std=gnu++11 -fext-numeric-literals"
TPL_ENABLE_quadmath:BOOL=ON
You may also have to tell CMake where to find the libquadmath library
and quadmath.h header file:
quadmath_LIBRARY_DIRS:FILEPATH="${QUADMATH_LIB_DIR}"
quadmath_INCLUDE_DIRS:FILEPATH="${QUADMATH_INC_DIR}"
Here, ${QUADMATH_LIB_DIR} points to the directory containing the
libquadmath library (usually your GCC library directory), and
${QUADMATH_INC_DIR} points to the directory containing its header file
(quadmath.h). For example, if you use a GCC installed in
$HOME/pkg/gcc-5.2.0, you might need to set those variables as follows:
QUADMATH_LIB_DIR=$HOME/pkg/gcc-5.2.0/lib
QUADMATH_INC_DIR=\
$HOME/pkg/gcc-5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0/include
Trilinos likes to set the "-pedantic" flag, which causes warnings for
__float128 literals. The build works regardless, but it would be more
pleasing to your eyes if you could figure out how to shut off the
warnings.
I implemented this because the Kokkos refactor of Tpetra broke QD
support (dd_real and qd_real -- "double-double" and "quad-double,"
128-bit and 256-bit floating-point arithmetic, respectively).
Applications were asking for a work-around solution.
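As a small illustration of the extra precision that __float128 provides
over double, the following sketch checks whether 1 + 2^-80 can be told
apart from 1. It does not use Tpetra. The function name
resolvesTinyIncrement is hypothetical, and the sketch assumes GCC on a
platform where __float128 is available (for example x86-64). It avoids
__float128 literals and libquadmath calls, so it should not need
-fext-numeric-literals or TPL_ENABLE_quadmath.

```cpp
// Returns true if adding 2^-80 to 1 changes the value in type T.
template <typename T>
bool resolvesTinyIncrement() {
  T eps = 1;
  for (int i = 0; i < 80; ++i) eps /= 2;  // eps = 2^-80, a power of two
  const T one = 1;
  const T sum = one + eps;  // rounds back to 1 if T has too few significand bits
  return sum != one;
}
```

With Scalar = double the increment is lost, so
resolvesTinyIncrement<double>() returns false. With __float128 on such a
platform, resolvesTinyIncrement<__float128>() is expected to return true.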
Zoltan2
- Template argument for arbitrary global identifiers (zgid_t) has been
removed for greater efficiency in the code as well as greater conformity
with Trilinos.
- BasicUserTypes now has only three template parameters;
the zgid_t template argument has been removed.
- OrderingSolution now has two different template parameters:
<lno_t, gno_t> for <local ordinal type, global ordinal type>
- A new test driver capability has been added to Zoltan2 for more robust
testing and experimentation.
- An interface to the Zoltan partitioners has been added to Zoltan2.
Parameter "algorithm" == "zoltan" invokes Zoltan partitioners; parameters
needed by Zoltan are provided through a parameter sublist called
"zoltan_parameters". Zoltan's geometric and hypergraph methods are