**Problem Statement**

How much computational time does it take to conduct the forward elimination part of the Naïve Gauss Elimination method on a square matrix?

This post is brought to you by

- Holistic Numerical Methods Open Course Ware:
- the textbooks on
- the Massive Open Online Course (MOOCs) available at

### Like this:

Like Loading...

## Author: Autar Kaw

Autar Kaw (http://autarkaw.com) is a Professor of Mechanical Engineering at the University of South Florida. He has been at USF since 1987, the same year in which he received his Ph. D. in Engineering Mechanics from Clemson University. He is a recipient of the 2012 U.S. Professor of the Year Award. With major funding from NSF, he is the principal and managing contributor in developing the multiple award-winning online open courseware for an undergraduate course in Numerical Methods. The OpenCourseWare (nm.MathForCollege.com) annually receives 1,000,000+ page views, 1,000,000+ views of the YouTube audiovisual lectures, and 150,000+ page views at the NumericalMethodsGuy blog. His current research interests include engineering education research methods, adaptive learning, open courseware, massive open online courses, flipped classrooms, and learning strategies. He has written four textbooks and 80 refereed technical papers, and his opinion editorials have appeared in the St. Petersburg Times and Tampa Tribune.
View all posts by Autar Kaw

Thank you for the lectures Dr. Kaw, this brings back great memories.

LikeLike

Nice post, many times you hear the cost is O(n^3) but can easily forget why or what the lower order terms are.

One thing that comes to mind is that this is true as long as the computation is arithmetically intense. That is, the number of floating point operations to memory accesses is large. In the case of Gaussian elimination, this is true and so counting FLOPS is a good measure of cost. However, in a operation such as a sparse matrix-vector product (where you have low arithmetic intensity, 1 fused multiply-add per memory access) it is the memory bandwidth that limits performance.

LikeLike

Nice post, we are often reminded that the cost is O(n^3) but often forget why or what the lower order terms are.

One things that comes to mind is that modeling the FLOPS is a good measure of performance only if the computation has a high arithmetic intensity. That is, the number of floating point operations to memory accesses is high. This is true of Gaussian Elimination. However, in an operation such as a sparse matrix-vector multiply the arithmetic intensity is low (1 fused multiply add per memory access) and the operation becomes bandwidth limited.

LikeLiked by 1 person

Yes, so true Nate. This may be very rudimentary way to introduce students to computational time. Many variables enter into the equation for computational time as you pointed out! Watch soon for computational time for Gauss Jordan method of finding inverse!

LikeLike