Optimizing a special case of integer-division blocking

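Integer-division blocking (整除分块) is a very basic technique, but once you start squeezing the constant factor, the code stops being clean.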

Suppose we need to compute \[\sum_{i=1}^n \left\lfloor \frac{n}{i} \right\rfloor\]
A naive approach is:

for (long long i=1,la; i<=n; i=la+1){
    la=n/(n/i);             // last index whose quotient equals n/i
    ans+=(n/i)*(la-i+1);    // every index in [i,la] contributes n/i
}

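However, this loop runs for about 2*sqrt(n) iterations (a constant factor of 2 on the square root), and the hot loop performs 3 divisions per iteration (this can be reduced to 2). For large n it is slow.
Today I happened to come across my own submission for 松1, so I happily went back and reviewed the better algorithm.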
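A minimal sketch of the two-division variant, assuming the saving comes from caching the quotient n/i instead of recomputing it. The function name `sum_floor_blocks` and the variable `q` are illustrative and do not appear in the original post.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (name is illustrative): sums floor(n / i) for i = 1..n
// by walking blocks of equal quotient, with two divisions per block.
std::uint64_t sum_floor_blocks(std::uint64_t n) {
    std::uint64_t ans = 0;
    std::uint64_t i = 1;
    while (i <= n) {
        const std::uint64_t q = n / i;     // division 1: quotient shared by the whole block
        assert(q >= 1);                    // holds because i <= n, so dividing by q is safe
        const std::uint64_t last = n / q;  // division 2: largest index with the same quotient
        ans += q * (last - i + 1);         // the block contributes q once per index
        if (last == n) break;              // avoids wrapping last + 1 when n is the maximum value
        i = last + 1;
    }
    return ans;
}
```

For example, `sum_floor_blocks(10)` adds up 10+5+3+2+2+1+1+1+1+1. The iteration count is still about 2*sqrt(n); only the work per block changes.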
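First, derive the formula.
Each term floor(n/i) counts the j with i*j <= n, so the sum we want equals \[\sum_{i=1}^n \sum_{j=1}^n [i \cdot j \leq n] \]
Split it at the square root: \[ \sum_{i=1}^{ \lfloor \sqrt {n} \rfloor} \sum_{j=1}^n [i \cdot j \leq n] +\sum_{i= \lfloor \sqrt{n} \rfloor +1}^n \sum_{j=1}^n [i \cdot j \leq n] \]
Tighten the bound on j in the second term (if i exceeds the square root and i*j <= n, then j cannot exceed it): \[ \sum_{i=1}^{ \lfloor \sqrt {n} \rfloor} \sum_{j=1}^n [i \cdot j \leq n] +\sum_{i= \lfloor \sqrt{n} \rfloor +1}^n \sum_{j=1}^{\lfloor \sqrt{n} \rfloor} [i \cdot j \leq n] \]
Next, make the two terms look the same by extending the second sum down to i=1 and subtracting what was added:
\[ \sum_{i=1}^{ \lfloor \sqrt {n} \rfloor} \sum_{j=1}^n [i \cdot j \leq n] +\sum_{i=1}^n \sum_{j=1}^{\lfloor \sqrt{n} \rfloor} [i \cdot j \leq n]-\sum_{i=1}^{ \lfloor \sqrt {n} \rfloor} \sum_{j=1}^{\lfloor \sqrt {n} \rfloor}[i \cdot j \leq n]\]
Merge. The first two terms are equal by symmetry in i and j, and every pair with both indices at most the floor of the square root satisfies i*j <= n, so the subtracted term is just the number of such pairs:
\[2 \cdot \sum_{i=1}^{\lfloor \sqrt {n} \rfloor} \sum_{j=1}^n [i \cdot j \leq n] - ( \lfloor \sqrt{n} \rfloor)^2 \]
Rewrite the inner sum as a floor:
\[2 \cdot \sum_{i=1}^{\lfloor \sqrt{n} \rfloor} \left\lfloor \frac{n}{i} \right\rfloor -(\lfloor \sqrt{n} \rfloor) ^2 \]
Now it can be computed quickly!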
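As a quick sanity check of the final formula, here is a small worked case. The value n = 10 is my own choice and does not come from the original post.

\[ \lfloor \sqrt{10} \rfloor = 3, \qquad 2 \cdot (10 + 5 + 3) - 3^2 = 27 \]

Summing all ten terms directly gives the same value:

\[ 10+5+3+2+2+1+1+1+1+1 = 27 \]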

%:pragma GCC target("avx")
%:pragma GCC optimize(3)
%:pragma GCC optimize("Ofast")
%:pragma GCC optimize("inline")
%:pragma GCC optimize("-fgcse")
%:pragma GCC optimize("-fgcse-lm")
%:pragma GCC optimize("-fipa-sra")
%:pragma GCC optimize("-ftree-pre")
%:pragma GCC optimize("-ftree-vrp")
%:pragma GCC optimize("-fpeephole2")
%:pragma GCC optimize("-ffast-math")
%:pragma GCC optimize("-fsched-spec")
%:pragma GCC optimize("unroll-loops")
%:pragma GCC optimize("-falign-jumps")
%:pragma GCC optimize("-falign-loops")
%:pragma GCC optimize("-falign-labels")
%:pragma GCC optimize("-fdevirtualize")
%:pragma GCC optimize("-fcaller-saves")
%:pragma GCC optimize("-fcrossjumping")
%:pragma GCC optimize("-fthread-jumps")
%:pragma GCC optimize("-funroll-loops")
%:pragma GCC optimize("-fwhole-program")
%:pragma GCC optimize("-freorder-blocks")
%:pragma GCC optimize("-fschedule-insns")
%:pragma GCC optimize("inline-functions")
%:pragma GCC optimize("-ftree-tail-merge")
%:pragma GCC optimize("-fschedule-insns2")
%:pragma GCC optimize("-fstrict-aliasing")
%:pragma GCC optimize("-fstrict-overflow")
%:pragma GCC optimize("-falign-functions")
%:pragma GCC optimize("-fcse-skip-blocks")
%:pragma GCC optimize("-fcse-follow-jumps")
%:pragma GCC optimize("-fsched-interblock")
%:pragma GCC optimize("-fpartial-inlining")
%:pragma GCC optimize("no-stack-protector")
%:pragma GCC optimize("-freorder-functions")
%:pragma GCC optimize("-findirect-inlining")
%:pragma GCC optimize("-frerun-cse-after-loop")
%:pragma GCC optimize("inline-small-functions")
%:pragma GCC optimize("-finline-small-functions")
%:pragma GCC optimize("-ftree-switch-conversion")
%:pragma GCC optimize("-foptimize-sibling-calls")
%:pragma GCC optimize("-fexpensive-optimizations")
%:pragma GCC optimize("-funsafe-loop-optimizations")
%:pragma GCC optimize("inline-functions-called-once")
%:pragma GCC optimize("-fdelete-null-pointer-checks")
#include <iostream>
#include <cmath>
using namespace std;
typedef unsigned long long ll;
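int main(){
    ll n; cin>>n;
    ll ans=0;
    ll p=sqrt((long double)n);          // floor(sqrt(n)), possibly off by one after rounding
    while (p && p>n/p) --p;             // correct downward so that p*p<=n
    while (p+1<=n/(p+1)) ++p;           // correct upward so that (p+1)*(p+1)>n
    for (ll i=p; i; --i) ans+=n/i;      // sum of floor(n/i) for i=1..p
    ans=ans*2-p*p;                      // pairs with both i,j<=p were counted twice
    cout<<ans<<endl;
}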
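To gain some confidence in the formula, it can be compared against a plain O(n) loop for small n. The sketch below is one way to do that, not the original author's code: the names `isqrt_floor`, `sum_floor_fast` and `sum_floor_brute` are hypothetical, and the integer square root is corrected after the floating-point call on the assumption that `sqrt` may be off by one for large inputs.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: exact floor(sqrt(n)), fixing any floating-point rounding.
std::uint64_t isqrt_floor(std::uint64_t n) {
    std::uint64_t p = static_cast<std::uint64_t>(std::sqrt(static_cast<long double>(n)));
    while (p > 0 && p > n / p) --p;      // p * p > n, tested without overflow
    while (p + 1 <= n / (p + 1)) ++p;    // (p + 1) * (p + 1) <= n, tested without overflow
    return p;
}

// The closed form derived above: 2 * sum_{i <= p} floor(n / i) - p * p.
std::uint64_t sum_floor_fast(std::uint64_t n) {
    const std::uint64_t p = isqrt_floor(n);
    assert(p == 0 || p <= n / p);        // invariant: p * p <= n
    std::uint64_t s = 0;
    for (std::uint64_t i = 1; i <= p; ++i) s += n / i;
    return 2 * s - p * p;
}

// Reference implementation: the definition itself, O(n) divisions.
std::uint64_t sum_floor_brute(std::uint64_t n) {
    std::uint64_t s = 0;
    for (std::uint64_t i = 1; i <= n; ++i) s += n / i;
    return s;
}
```

Calling `sum_floor_fast(n)` and `sum_floor_brute(n)` for a range of small n and checking that they agree is a cheap way to catch an off-by-one in the square root. The result itself grows roughly like n*ln(n), so it overflows 64 bits for the very largest inputs.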


Reposted from www.cnblogs.com/Yuhuger/p/9940189.html