Writing in assembly will not give you a magic increase in speed: because of the amount of detail you have to manage (register allocation etc.), you will probably end up writing the most trivial algorithm possible.

Additionally, on modern processors (read: designed after the 1970s-80s) assembly will not give you enough detail to know what is actually going on (that is, on most processors). Modern PUs (CPUs and GPUs) are quite complex as far as instruction scheduling goes. Knowing the basics of assembly (or pseudo-assembly) lets you follow computer architecture books and courses, which provide the further knowledge (caches, out-of-order execution, MMU etc.). Usually you don't need to know a complex ISA to understand them (MIPS 5 is quite popular IIRC).

Why understand the processor? It can give you a much better idea of what is going on. Let's say you write matrix multiplication the naive way:

    for i from 0 to N
        for j from 0 to N
            for k from 0 to N
                A[i][j] += B[i][k] * C[k][j]

It may be 'good enough' for your purpose (if it is a 4x4 matrix it might be compiled to vector instructions anyway). However, there are quite important programs that multiply massive arrays - how do you optimize them? If you write the code in assembly you might gain a few % (unless you do what most people do and write it the naive way too, underutilizing registers and constantly loading from and storing to memory, and so end up with a slower program than in a high-level language).
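To make the "loading/storing to memory constantly" point concrete, here is a minimal C sketch (my own illustration, not taken from any particular compiler output). The function names and the flat row-major layout are assumptions. The first version updates the result element in memory on every step; the second keeps the running sum in a local variable, which a compiler can usually keep in a register.

```c
#include <assert.h>
#include <stddef.h>

/* Updates a[i*n + j] in memory on every k step. Because a is not marked
   restrict, the compiler has to allow for a overlapping b or c, so it may
   reload and store the element on each iteration. */
void matmul_memory_accumulate(size_t n, double *a,
                              const double *b, const double *c)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                a[i * n + j] += b[i * n + k] * c[k * n + j];
}

/* Same arithmetic in the same order, but the running sum lives in a local
   scalar and is written back once per (i, j). */
void matmul_local_accumulate(size_t n, double *a,
                             const double *b, const double *c)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            double sum = a[i * n + j];
            for (size_t k = 0; k < n; k++)
                sum += b[i * n + k] * c[k * n + j];
            a[i * n + j] = sum;
        }
}
```

Both functions expect `a` to be zeroed (or to hold a value you want to add to) before the call. Whether the second one is actually faster depends on the compiler and flags, so measure rather than assume.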

However, you can swap two of the loop lines in the naive version and magically gain performance (why? I leave it as 'homework') - IIRC, depending on various factors, for large matrices it can be as much as 10x.

    for i from 0 to N
        for k from 0 to N
            for j from 0 to N
                A[i][j] += B[i][k] * C[k][j]
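If you want to try this yourself, the two loop orders could be written in C roughly as below. This is only a sketch: the function names, the use of `double`, and the flat row-major arrays are my assumptions, and it uses the usual multiply-accumulate update for a matrix product. Both functions add the terms for each element in the same order of `k`, so they produce the same result; only the memory access pattern differs.

```c
#include <assert.h>
#include <stddef.h>

/* Naive order i-j-k: the innermost loop steps through c one full row
   apart on each iteration (index k*n + j with k changing). */
void matmul_ijk(size_t n, double *a, const double *b, const double *c)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                a[i * n + j] += b[i * n + k] * c[k * n + j];
}

/* Interchanged order i-k-j: the innermost loop walks along one row of a
   and one row of c, while b[i*n + k] stays fixed. */
void matmul_ikj(size_t n, double *a, const double *b, const double *c)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++) {
            const double b_ik = b[i * n + k];
            for (size_t j = 0; j < n; j++)
                a[i * n + j] += b_ik * c[k * n + j];
        }
}
```

To see the difference, zero `a`, fill `b` and `c` for some large `n`, and time both calls with optimizations enabled. The actual ratio depends on your machine, compiler and matrix size.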

That said, there is ongoing work on compilers that can do this themselves (Graphite for GCC and Polly for anything using LLVM). They are even capable of transforming it into (sorry - I'm writing blocking from memory):

    for i from 0 to N
        for K from 0 to N/n
            for J from 0 to N/n
                for kk from 0 to n
                    for jj from 0 to n
                        k = K*n + kk
                        j = J*n + jj
                        A[i][j] += B[i][k] * C[k][j]
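A hand-written C version of that blocked loop nest might look like the sketch below. It is not what Graphite or Polly actually emit; the function names and the block-size parameter `bs` are made up for illustration, and it assumes the matrix size is a multiple of the block size, as the pseudocode does. A plain triple loop is included only so the result can be checked.

```c
#include <assert.h>
#include <stddef.h>

/* Plain i-j-k reference, used only to check the blocked version. */
void matmul_reference(size_t n, double *a, const double *b, const double *c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                a[i * n + j] += b[i * n + k] * c[k * n + j];
}

/* Blocked over k and j: kb and jb are the first indices of each block,
   i.e. K*n and J*n in the pseudocode. Assumes n is a multiple of bs. */
void matmul_blocked(size_t n, size_t bs, double *a,
                    const double *b, const double *c)
{
    assert(bs > 0 && n % bs == 0);
    for (size_t i = 0; i < n; i++)
        for (size_t kb = 0; kb < n; kb += bs)
            for (size_t jb = 0; jb < n; jb += bs)
                for (size_t k = kb; k < kb + bs; k++) {
                    const double b_ik = b[i * n + k];
                    for (size_t j = jb; j < jb + bs; j++)
                        a[i * n + j] += b_ik * c[k * n + j];
                }
}
```

A reasonable block size depends on the cache sizes of the target machine, so treat `bs` as something to tune by measurement rather than a fixed constant.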

To summarise: knowing the basics of assembly lets you dig into the various 'details' of processor design, which in turn lets you write faster programs. It may also be good to know the differences between RISC/CISC or VLIW/vector processor/SIMD/... architectures. However, I would not start with x86, as it tends to be quite complicated (possibly ARM too) - knowing what a register is etc. is IMHO sufficient for a start.