A Dangerous Maze (II) LightOJ - 1395 (expectation DP)

You are in a maze; seeing n doors in front of you in beginning. You can choose any door you like. The probability for choosing a door is equal for all doors.

If you choose the ith door, it can either take you back to the same position where you began in xi minutes, or can take you out of the maze after xi minutes. If you come back to the same position, you can remember the last K doors you have chosen. And when you are about to choose a door, you never choose a door that is already visited by you. Or we can say that you never choose a door that is visited as one of the last K doors. And the probability of choosing any remaining door is equal.

Now you want to find the expected time to get out of the maze.

Input

Input starts with an integer T (≤ 100), denoting the number of test cases.

Each case contains a blank line and two integers n and K (1 ≤ n ≤ 100, 0 ≤ K ≤ n). The next line contains n space separated integers. If the ith integer (xi) is positive, you can assume that the ith door will take you out of the maze after xi minutes. If it's negative, then the ith door will take you back to the beginning position after abs(xi) minutes. You can safely assume that 1 ≤ abs(xi) ≤ 10000.

Output

For each case, print the case number and the expected time to get out of the maze. If it's impossible to get out of the maze, print '-1'. Otherwise print the result. Error less than 10^-6 will be ignored.

Sample Input

4

2 0
10 10

2 0
10 -10

3 1
10 -10 -20

3 2
10 -10 -20

Sample Output

Case 1: 10
Case 2: 20.000
Case 3: 30.0000000000
Case 4: 25.0000000000

Approach:

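Every remembered door must be a negative-valued door, because a positive-valued door takes you straight out of the maze. Suppose one door is currently remembered: the next choice is then effectively made among only n-1 doors. What happens if that choice is again a negative-valued door? It simply moves us to the next stage, in which 2 doors are remembered, and so on.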

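We still need to know one thing: once a door is remembered, what is the expected time of the next negative-valued door that gets chosen?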

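It can be shown that, no matter how many doors are remembered, the expected time of a chosen negative-valued door is always sum2/cnt2, where cnt2 is the number of negative-valued doors and sum2 is the sum of their absolute values.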
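One way to see why the average never changes is a symmetry argument. The sketch below is an informal justification, not a proof taken from the original post; $N$ (the set of negative-valued doors) and $D_m$ (the m-th negative-valued door chosen, given that an m-th such choice happens) are notation introduced only for this sketch.

```latex
% The selection rule only looks at which doors are remembered, never at their
% times, and whether the next door is an exit depends only on how many doors are
% remembered. Relabelling the doors in N by any permutation \sigma therefore
% leaves the process unchanged, so D_m and \sigma(D_m) have the same distribution:
\Pr[D_m = j] = \frac{1}{\mathrm{cnt2}} \quad \text{for every } j \in N,
\qquad
\mathbb{E}\bigl[\,|x_{D_m}|\,\bigr]
  = \sum_{j \in N} \frac{|x_j|}{\mathrm{cnt2}}
  = \frac{\mathrm{sum2}}{\mathrm{cnt2}}.
```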

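So let dp[i] be the expected time to get out when i doors are remembered. First handle the state in which k doors are remembered, then work backwards to the state with 0 remembered doors; dp[0] is the answer.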

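For 0 <= i < min(k, cnt2), we have dp[i] = sum1/(n-i) + (cnt2-i)*(dp[i+1]+X)/(n-i), where sum1 is the sum of the positive-valued doors and X is the average time of the next door chosen from the remaining negative-valued doors after i of the cnt2 negative-valued doors have already been chosen. That average is just sum2/cnt2, the mean over all negative-valued doors.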
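The recurrence can be sketched as a small backward loop. `fillBackward` is a hypothetical helper, not part of the original post, and it assumes the caller supplies the base value dp[k] (for example sum1/cnt1 when every negative-valued door can be remembered).

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: given the base value dp[k], fills dp[k-1], ..., dp[0]
// with dp[i] = sum1/(n-i) + (cnt2-i)*(dp[i+1] + X)/(n-i), X = sum2/cnt2.
std::vector<double> fillBackward(int n, int k, int cnt2,
                                 double sum1, double sum2, double base) {
    // k remembered doors cannot exceed cnt2, and at least one exit must exist.
    assert(0 <= k && k <= cnt2 && cnt2 < n);
    std::vector<double> dp(k + 1);
    dp[k] = base;
    // Average time of a negative-valued door; unused when cnt2 == 0.
    const double X = cnt2 > 0 ? sum2 / cnt2 : 0.0;
    for (int i = k - 1; i >= 0; --i) {
        dp[i] = sum1 / (n - i) + (cnt2 - i) * (dp[i + 1] + X) / (n - i);
    }
    return dp;
}
```

For the fourth sample (n = 3, K = 2, doors 10 -10 -20) the base is dp[2] = 10, and `fillBackward(3, 2, 2, 10, 30, 10)` gives dp[1] = 17.5 and dp[0] = 25, matching the expected output.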

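If k < cnt2, the memory is already full in state k, so choosing another negative-valued door leaves us in state k: dp[k] = sum1/(n-k) + (cnt2-k)*(dp[k]+X)/(n-k). Solve this equation for dp[k]. If k >= cnt2, every negative-valued door can be remembered, so the base state is dp[cnt2] = sum1/cnt1, the average over the cnt1 positive-valued doors.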
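The equation for dp[k] refers to itself, so it is solved by collecting the dp[k] terms on one side. The steps below are a worked derivation with the same symbols (X = sum2/cnt2); the last line agrees with the expression assigned to dp[k] in the code.

```latex
\begin{aligned}
dp[k] &= \frac{\mathrm{sum1}}{n-k} + \frac{(\mathrm{cnt2}-k)\,(dp[k]+X)}{n-k} \\
dp[k]\left(1-\frac{\mathrm{cnt2}-k}{n-k}\right) &= \frac{\mathrm{sum1}+(\mathrm{cnt2}-k)\,X}{n-k} \\
dp[k]\cdot\frac{n-\mathrm{cnt2}}{n-k} &= \frac{\mathrm{sum1}+(\mathrm{cnt2}-k)\,X}{n-k} \\
dp[k] &= \frac{\mathrm{sum1}+(\mathrm{cnt2}-k)\,X}{n-\mathrm{cnt2}}
\end{aligned}
```

The denominator n-cnt2 is the number of positive-valued doors, which is nonzero once the impossible case has been ruled out. For the second sample (n = 2, k = 0, sum1 = 10, cnt2 = 1, X = 10) this gives (10 + 10)/1 = 20.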

Code:

#include<iostream>
#include<cstring>
#include<cstdlib>
#include<cstdio>
using namespace std;
const int maxn=1e3+9;
int n;
double dp[maxn];
int main(int argc, char const *argv[])
{
    #ifndef ONLINE_JUDGE
        freopen("in.txt","r",stdin);
        freopen("out.txt","w",stdout);
    #endif
    int T;
    scanf("%d",&T);
    int Case=0;
    while(T--)
    {
        memset(dp,0,sizeof(dp));
        int k;
        scanf("%d%d",&n,&k);
        int cnt1=0,cnt2=0,sum1=0,sum2=0;
        for(int i=1;i<=n;i++)
        {
            int a;
            scanf("%d",&a);
            if(a>0)
            {
                cnt1++;
                sum1+=a;
            }
            else
            {
                cnt2++;
                sum2+=abs(a);
            }
        }
        if(cnt2==n)
        {
            printf("Case %d: -1\n",++Case);
            continue;
        }
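        if(k>=cnt2)
        {
            // Every returning door can be remembered: once all cnt2 of them
            // are, only exit doors remain, so the expected time is their mean.
            k=cnt2;
            dp[k]=1.0*sum1/cnt1;
        }
        else
        {
            // Memory is full at k < cnt2 doors, so state k loops back to itself.
            // Closed form of dp[k]=sum1/(n-k)+(cnt2-k)*(dp[k]+sum2/cnt2)/(n-k):
            dp[k]=(sum1+(1.0*sum2*(cnt2-k))/cnt2)/(n-cnt2);
        }
        // Work backwards from k remembered doors down to none.
        for(int i=k-1;i>=0;i--)
        {
            dp[i]=1.0*sum1/(n-i)+(1.0*(cnt2-i)*(dp[i+1]+1.0*sum2/cnt2))/(n-i);
        }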
        printf("Case %d: %.6lf\n",++Case,dp[0]);
    }
    return 0;
}
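As an independent sanity check, the DP result can be compared with a Monte Carlo estimate. This is a different technique from the one used in the post, offered only as a rough cross-check; `simulate`, its parameters, and the sliding-window reading of "last K doors" are assumptions of this sketch.

```cpp
#include <cassert>
#include <deque>
#include <random>
#include <vector>

// Hypothetical helper: estimates the expected escape time by direct simulation.
// Assumes at least one door is positive; otherwise a trial would never end.
double simulate(const std::vector<int>& doors, int K, int trials, unsigned seed) {
    bool hasExit = false;
    for (int x : doors) hasExit = hasExit || x > 0;
    assert(hasExit);

    std::mt19937 rng(seed);
    const int n = static_cast<int>(doors.size());
    double total = 0.0;
    for (int t = 0; t < trials; ++t) {
        std::deque<int> memory;           // indices of the last K doors chosen
        std::vector<char> blocked(n, 0);  // blocked[i] != 0 iff door i is remembered
        while (true) {
            // Collect the doors that are not currently remembered.
            std::vector<int> open;
            for (int i = 0; i < n; ++i) {
                if (!blocked[i]) open.push_back(i);
            }
            std::uniform_int_distribution<int> pick(0, static_cast<int>(open.size()) - 1);
            const int d = open[pick(rng)];
            if (doors[d] > 0) {           // exit door: the trial ends here
                total += doors[d];
                break;
            }
            total += -doors[d];           // returning door: pay its time
            memory.push_back(d);
            blocked[d] = 1;
            if (static_cast<int>(memory.size()) > K) {  // forget the oldest door
                blocked[memory.front()] = 0;
                memory.pop_front();
            }
        }
    }
    return total / trials;
}
```

With a few hundred thousand trials, `simulate({10, -10, -20}, 1, 200000, 12345)` should land close to the DP answer of 30 for the third sample, though the exact value depends on the random number library.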


Reposted from blog.csdn.net/qq_40774175/article/details/81571147