"Dead carry" high concurrent flow, barley grab votes technology Nirvana road

Authors | Alibaba Entertainment test development experts Shu Qin and Jin Ye

Published by | AI Technology Base Camp (ID: rgznai100)

Background of Damai ticket grabbing

Damai (大麦网) sells tickets for pop concerts, classical concerts, sports events, theatre, exhibitions, and parent-child activities, and also provides on-site services. Its business covers the whole chain, from production on the B (business) side, through sales on the C (consumer) side, to on-site ticket checking. A typical class of Damai items is the scarce, popular IP project, such as a concert or a sports event. Tickets of this kind are limited in time and space, so people have to grab them. A Damai ticket rush is the performance industry's Double 11: it involves complex scenarios, many systems, and long call chains, so safeguarding the rush is particularly important.

Damai's ticket-rush safeguards have gone through four stages. First, the "primitive" stage: safeguards were incomplete and facilities inadequate. Second, the "Alibaba-domain migration" stage: some large ticket rushes were completed successfully. Third, the "systematic" stage: the system could carry every large ticket rush. Fourth, the "normalization" stage: the large-rush experience was optimized and upgraded.

"Primitive" stage

Large ticket rushes easily triggered rate limiting, the experience was not smooth, and so on. Embarrassing!

1. Why it was like this

At this stage the Damai technical team was still in the early phase of discussing and aligning with Alibaba's business, product, and technology teams. Most of the core systems for large ticket rushes were still in the Damai IDC machine room, running on a hybrid of .NET and Java, and the original Damai systems carried most of the ticket-rush load.

Why did this system face so much pressure and risk in every major ticket rush? The main reasons were:

1) Incomplete infrastructure: the hardware and bandwidth of the Damai IDC machine room were limited. The database was SQL Server Enterprise Edition, and many services ran on a single database. Large ticket rushes, especially seat-selection purchases, which consume a lot of bandwidth and resources, posed a severe challenge;

2) Contingency plans and rate limiting still to be built: protective measures for high traffic and high load, such as rate limiting and degradation, were not yet in place, so more than once the system came under pressure and went straight to alarm;

3) Fragmented monitoring and operations: locating and solving problems took a long time.
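The rate limiting and degradation named in point 2) can be pictured with a small sketch. It is only an illustration, not Damai's implementation: the `TokenBucket` class, the `handle_request` helper, and every parameter value are hypothetical, and a token bucket is just one common way to limit request rates.

```python
import time


class TokenBucket:
    """Hypothetical limiter: allows bursts up to `capacity`, refills `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def try_acquire(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def handle_request(limiter, full_handler, degraded_handler):
    """Serve the full response while under the limit; otherwise fall back to a cheap one."""
    if limiter.try_acquire():
        return full_handler()
    return degraded_handler()
```

In this sketch a rejected request still gets an answer, for example a cached or simplified page, which is the sense in which degradation protects the system without failing outright.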

2. Solutions and results

Some temporary measures were taken at this stage, such as a minimal rate-limiting scheme, a few targeted performance optimizations, changes to application configuration parameters, and a first pass at ticket-rush contingency plans. They alleviated some problems, but overall the large-ticket-rush problem remained unsolved.

"Alibaba-domain migration" stage

Given the many problems of the first stage, product and technology started a large-scale transformation of the system. New systems were rebuilt directly in the Alibaba domain, and as they were built they gradually replaced the old ones: a plan to change the engine in mid-flight.

1. Changing the engine in mid-flight

To move critical systems into the Alibaba domain quickly, the team defined steps such as direct migration and temporary or long-term reconstruction. The specific measures were:

1) Moving part of the call chain in-domain, starting with the app: the technical changes focused on the wireless side. All app user interface calls entered through the Alibaba domain and were then routed to the Damai IDC, so that the Alibaba machine rooms absorbed much of the traffic first;

2) Leveraging Alibaba's basic operations capabilities: once the entrance was connected to the Alibaba domain, some rate limiting and degradation could be done on the platform, and monitoring and troubleshooting could use tools such as eagleeye and maieye. During this period, services such as the user center and the message center began to be rebuilt and launched in the Alibaba domain;

3) First ticket-rush contingency plans: the main site's plan platform was used to set up plans for large Damai ticket rushes, for example adding a tair cache for the product details page, so that tair in the Alibaba domain absorbed the traffic and fewer requests reached the Damai IDC;
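The caching idea in point 3) follows the familiar cache-aside pattern. The sketch below is an assumption-laden illustration only: an in-process dictionary stands in for tair, and `TtlCache`, `get_or_load`, and the loader callback are hypothetical names, not part of any Damai or tair API.

```python
import time


class TtlCache:
    """Hypothetical cache-aside helper; a dict stands in for a distributed cache such as tair."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the backend is not called
        value = loader(key)  # cache miss: one call reaches the backend
        self._store[key] = (now + self.ttl, value)
        return value
```

With a 30-second TTL, any number of reads for the same product within those 30 seconds costs the backend a single call, which is how a cache in front of the IDC reduces the requests that reach it.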

2. Staying the course, firmly and unswervingly

After these optimizations the system stayed up during hot ticket rushes, but rate limiting still hurt the user experience. Analysis of the problems that occurred during ticket rushes showed that they were concentrated in the parts still constrained by the Damai machine room and by insufficient operations experience. This gave rise to the third stage: moving the whole call chain for large ticket rushes into the Alibaba domain.

"Systematic" stage

To address the issues left over from the previous stage, both the business processes and the technical systems for ticket grabbing were fully upgraded, and a ticket-rush process and safeguard mechanism was established and refined. After the upgrade Damai could carry every large ticket rush, and the user experience improved.

1. In-domain migration in full bloom

1) The new search went live in the Alibaba domain: the problem of oversized search responses was solved, and the impact on bandwidth disappeared;

2) The new seat selection went live in the Alibaba domain: the heavy traffic of seat-selection ticket rushes hit the Alibaba domain directly, and an asynchronous mechanism similar to ConcurrentHashMap was used to balance seat data in the Damai IDC, call volume, and cache consistency;

3) The new trading system went live in the Alibaba domain: core interfaces such as transaction creation and order placement were all placed in the Alibaba domain. After an order was placed, it was synchronized to the Damai machine room for subsequent fulfilment services;
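The article does not describe the seat-selection mechanism beyond comparing it to ConcurrentHashMap, so the following is only a guess at the general shape: seat locks are decided atomically in the front tier, and the changes are queued and pushed to the backend in batches. `SeatMap`, `try_lock`, `flush`, and the batch callback are all hypothetical.

```python
import threading
from collections import deque


class SeatMap:
    """Hypothetical in-domain seat state with batched, asynchronous sync to a backend."""

    def __init__(self):
        self._holder = {}        # seat_id -> user_id
        self._lock = threading.Lock()
        self._pending = deque()  # changes not yet sent to the backend

    def try_lock(self, seat_id, user_id):
        """Atomic put-if-absent: the first caller wins the seat."""
        with self._lock:
            if seat_id in self._holder:
                return False
            self._holder[seat_id] = user_id
            self._pending.append((seat_id, user_id))
            return True

    def flush(self, send_batch, max_batch=100):
        """Drain up to max_batch pending changes and send them in one backend call."""
        with self._lock:
            count = min(max_batch, len(self._pending))
            batch = [self._pending.popleft() for _ in range(count)]
        if batch:
            send_batch(batch)
        return len(batch)
```

A background worker would call `flush` periodically. The trade-off in such a design is that the backend sees fewer, larger calls, at the price of a short window in which its seat data lags behind the front tier.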

2. Building the assurance process

Because information between the business side and the technical side was inconsistent, or preparation for ad hoc ticket rushes was inadequate, the test team led the business and technical sides to agree on the ticket-rush process, the participants, and the ticket-rush setup. The result was consolidated into a ticket-rush assurance process, with SOPs established for the manual operations involved. The process was organized as follows:

1) [Process construction] The ticket-rush process is divided into three phases: before, during, and after the rush. Before the rush, the focus is on the business side declaring the ticket rush. The technical side then confirms whether to schedule a rehearsal or a load test, judges the level of the rush from the information provided by the business side and from historical ticket rushes, and determines the scope of contingency plans to execute and the risk-control level. During the rush, the focus is on process monitoring and emergency response. After the rush, the focus is on rolling back contingency plans, producing the ticket-rush report, and holding a retrospective on problems found during the rush;

2) [Process construction] Roles taking part in a ticket rush: a ticket rush involves multiple parties, including business, product, development, testing, and PR. The main safeguarding roles on the business side are BD, editorial, and customer service. The main supporting roles are development, testing, customer service, public relations, and other related staff;

During a ticket rush each role carries out its own duties to safeguard the rush. A lasting impression of this stage is that for every big rush the relevant people gathered in one room, and after the rush completed successfully, bursts of cheers came from the conference room.

3. Special projects: plans / rehearsals / capacity / actions

For every ticket-rush safeguard, once the business side has clarified the project information, the test team takes the lead, pulling in the parties involved for an overall assessment and coordinating execution. Whether in preparation before the rush or in the retrospective after it, each small special project is carried through.

1) [Quality assurance] Rehearsal: projects that follow an existing, mature flow generally need no rehearsal. For new ticket-rush methods or large rushes, the test team arranges a simulated ticket rush in which the business side, the technical side, and the testers for each client take part, so that business configuration problems or experience problems are exposed in advance;

2) [Quality assurance] Performance capacity: the technical side pulls the latest full-link load-test data from recent similar projects together with real online capacity, assesses whether the estimated ticket-rush volume can be supported and whether there is a performance bottleneck or a risk of rate limiting, and notifies the business side in advance. If no recent load-test data exists for this type of project, the test team arranges a load test for the large rush;

3) [Technical optimization] Plan execution: all call chains involved, including the home page, search, product details, ticketing-cloud seat selection, trading orders, ticketing-cloud inventory, order service, and the wireless clients, sort out their pre-planned and emergency plans for large-rush mode. For product details, for example, the pre-planned measures include degrading artist, follow, and marketing features, lengthening cache times for user, venue, and details data, and warming up services on the purchase path, such as the DB and tair, 30 minutes before the rush;

4) [Quality assurance] Retrospective: after each ticket rush, the test team collects and organizes the problems exposed by all parties during the rush, questions frequently raised with customer service, bugs collected online, complaint posts on the intranet, and issues reposted by external users on Weibo or WeChat. The parties then hold a retrospective, turn the optimization measures into actions, and follow up on their progress;
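The capacity assessment in point 2) is essentially a comparison of estimated peak load against load-tested capacity, stage by stage. A minimal sketch follows. The function name, the 20% safety margin, and all of the numbers are hypothetical and chosen only to make the arithmetic visible.

```python
def find_bottlenecks(expected_peak_qps, measured_capacity_qps, safety_margin=0.2):
    """Return the stages whose expected peak exceeds measured capacity minus a safety margin."""
    bottlenecks = []
    for stage, expected in expected_peak_qps.items():
        usable = measured_capacity_qps[stage] * (1 - safety_margin)
        if expected > usable:
            bottlenecks.append(stage)
    return bottlenecks


# Hypothetical numbers, for illustration only.
expected = {"details": 6000, "seat_selection": 3000, "order": 1900}
measured = {"details": 10000, "seat_selection": 4000, "order": 2000}
print(find_bottlenecks(expected, measured))  # ['order']
```

Here the order stage would be flagged: 1900 QPS expected against 2000 QPS measured leaves less than the 20% margin, so the business side would be warned of likely rate limiting before the rush.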

4. Ticket-rush monitoring dashboard

In addition to the monitoring customized for each ticket-rush project's business, a ticket-rush monitoring dashboard summarizes the data. It provides better monitoring support for every rush and lets the business side see the ticket-rush data at a glance.

[Figure: ticket-rush monitoring dashboard]

"Normalization" phase

After the systematic upgrade, one must ask: surely Damai ticket rushes no longer have problems? We can responsibly say that outages no longer occur. But recent ultra-large ticket rushes for scarce IP projects have been so hot that only a small minority of people get tickets. After big rushes, tickets are resold second-hand at high prices, and a rush that is not smooth leaves people unable to get tickets. Internal staff who missed out grumble, and external users complain furiously. The complaints centre on the product details page being rate limited during the rush, time-consuming interactions when placing an order that cause users to miss tickets, and unfriendly prompts in abnormal situations. To address these issues, in addition to optimizing the product design, the technical team set up many ticket-rush special projects to solve user pain points. These small but beautiful optimizations pushed the experience to the limit. Did you notice them?

1. Escorting real users

At this stage the new Damai trading system, honed online for nearly a year, was fully integrated with the shared trading platform and Star Ring. The core order-rendering and order-placement interfaces implement Damai's extension features on top of Star Ring's capabilities.

[Figure: new trading architecture]

Joining the shared platform brought Damai many benefits. The main contributions to the ticket-rush process are:

1) [Technical optimization] Relying on shared basic capabilities: besides reusing the shared platform's capabilities, Damai can also draw on the trading side's contingency plans for large promotions, such as rate limiting, system log monitoring, and the problem-investigation platform. Building on Star Ring, the technical side customized the closing of unpaid orders: the business side specifies how long an unpaid order stays open before it is closed, which effectively reduces malicious holding of inventory.

2) [Experience upgrade] Access to the group risk-control system: popular ticket rushes go through a three-layer prevention and control system, made up of real-person risk scores for Damai users, human-machine identification by group security's MTEE, and customized strategy packages. During order rendering and order placement, illegitimate users are intercepted along different dimensions, so that real users have a smoother ticket-rush experience and a much better chance of buying.
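The effect of closing unpaid orders after a business-defined time, described in point 1), can be sketched as an inventory hold with an expiry. This is an illustration under stated assumptions, not the Star Ring mechanism: `Inventory`, its methods, and the hold durations are hypothetical.

```python
import time


class Inventory:
    """Hypothetical stock tracker: unpaid orders hold stock until paid or expired."""

    def __init__(self, stock, clock=time.monotonic):
        self.available = stock
        self.clock = clock
        self._holds = {}  # order_id -> (quantity, expires_at)

    def reserve(self, order_id, quantity, hold_seconds):
        if quantity > self.available:
            return False
        self.available -= quantity
        self._holds[order_id] = (quantity, self.clock() + hold_seconds)
        return True

    def pay(self, order_id):
        """Payment turns the hold into a sale; returns False if the hold is gone."""
        return self._holds.pop(order_id, None) is not None

    def close_expired(self):
        """Close unpaid orders whose hold has expired and put their stock back."""
        now = self.clock()
        expired = [oid for oid, (_, expires_at) in self._holds.items() if expires_at <= now]
        for oid in expired:
            quantity, _ = self._holds.pop(oid)
            self.available += quantity
        return expired
```

The shorter the hold, the sooner tickets parked in unpaid orders return to sale, which is why letting the business side choose the duration limits malicious occupation of inventory.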

2. Core call-chain optimization

1) [Experience upgrade] Product details without rate limiting: in past ticket rushes, the product details page not only had to borrow machines for every big rush but was also rate limited, and it became a gathering place for user complaints. The main traffic-dispersal strategies for product details were:

a) Reducing concurrent requests before the rush starts: because this was controlled by hashing, it could go live and be verified quickly, but the effect was not obvious;

b) After the countdown, replacing automatic refresh with an interactive user click to disperse traffic: the effect was obvious;

c) Reducing calls in the main flow: the second-level page is called only when the user clicks buy at the end of the countdown, and the countdown check is moved to the second-level page. The effect was obvious.

After the transformation, the traffic of the details page and the pop-up layer page went from a huge gap, to level, to a large reversal. Product details no longer has to borrow machines for ticket rushes, it can support the most popular ticket-rush projects, and it no longer triggers rate limiting!
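The article does not say what strategy a) hashes. One plausible reading, offered purely as an assumption, is that a stable hash of some client identifier spreads pre-sale requests over a time window instead of letting them all fire at the same instant. `refresh_offset_ms` and its parameters are hypothetical.

```python
import hashlib


def refresh_offset_ms(user_id, window_ms):
    """Map a user to a stable offset in [0, window_ms) so clients do not all fire at once."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % window_ms


# Hypothetical usage: spread 1000 users over a 2-second window.
offsets = [refresh_offset_ms(f"user-{i}", 2000) for i in range(1000)]
print(min(offsets), max(offsets))
```

A click-driven refresh, as in strategy b), gets a similar spread for free, because human reaction times differ. That may be part of why the article reports a clearer effect for it.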

2) [Experience upgrade] Widening the order entrance: in past ticket rushes, heavy order traffic easily triggered rate limiting, and users complained that they could not get tickets. The main strategies for trading orders were:

a) A widened entrance + risk-control interception + fallback rate limiting: let users into the trading flow first, then filter out abnormal users through risk control, and finally protect downstream systems with a fallback rate limit, keeping the ticket rush as smooth as possible for real users;

b) Optimizing the payment and payment-privilege parts of the rendering stage: in large rushes, payment calls made during rendering are degraded or made asynchronous, which reduces the risk of users being rate limited at rendering;
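Why a wider entrance helps in strategy a) can be seen in a toy model of one time window. Everything here is hypothetical: the `admit` function, the limits, and the bot-to-user ratio are assumptions chosen only to show the ordering of the three steps.

```python
def admit(requests, entry_limit, is_suspicious, downstream_limit):
    """Apply the three steps in order and return the requests that reach downstream."""
    entered = requests[:entry_limit]                        # 1. entrance limit
    genuine = [r for r in entered if not is_suspicious(r)]  # 2. risk-control interception
    return genuine[:downstream_limit]                       # 3. fallback limit


def is_bot(request):
    return request[0] == "bot"


# Hypothetical window of 10 requests in which bots and real users alternate.
requests = [("bot" if i % 2 == 0 else "user", i) for i in range(10)]

narrow = admit(requests, entry_limit=4, is_suspicious=is_bot, downstream_limit=4)
wide = admit(requests, entry_limit=10, is_suspicious=is_bot, downstream_limit=4)
print(len(narrow), len(wide))  # 2 4
```

With a narrow entrance, bots use up half of the four slots before risk control ever sees them, so only two real users get through. With a wide entrance, risk control filters first and the fallback limit then fills all four downstream slots with real users.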

3. Performance normalization

To keep track of the latest performance level of the trading call chain, the performance test team launched a normalization project. Load tests are run regularly every month. The specific scenarios and targets are assessed against that month's feature releases, performance optimizations, and upstream or downstream dependencies. After each test the current state of the link's dependencies is updated, providing the most useful data for ticket rushes.

1) [Quality assurance] Routine execution: for example, as recent scarce-IP ticket rushes keep setting new records, development and testing again work together on peak-probing load tests of the system, based on current traffic.

2) [Quality assurance] Load-test automation: the test team sorted out the load-test process and distilled a load-test white paper for each business line. The more time-consuming steps of the process were extracted and automated. For example, generating the project and data needed for a load test used to take 11-12 time-consuming ad hoc steps. Now that the data is fixed and normalized, generating a load-test project takes only 3-4 steps and 3 minutes.

4. Plan automation

Previously, ticket rushes were frequent but had no standardized procedure (monitoring, plans, entrances). They needed many support staff and organized ticket-rush meetings, and they were labor-intensive.

1) [Technical optimization] Ticket-rush console:

BD or operations staff can set up plans [before the rush starts]. [During the rush] the console provides a unified view for [real-time monitoring], with the ability for [manual intervention and control]. [After the rush ends] it provides data for analysis. This helps BD complete ticket rushes on a self-service basis and achieves "lights-out" operation.

[Figure: ticket-rush console workflow]

2) [Technical optimization] Product details plan optimization:

For every big rush the details page needed 6 to 10 degradation plans to be executed manually, each with different values, such as the execution time. Setting them meant adjusting the plans on the basis of personal experience, and the manual operation was too cumbersome. The details plans mainly implement the basic degradation strategies: the original item cache in tair or the local cache, plus degradation and rate limiting when dependent third-party interfaces become abnormal. After optimization only one plan needs manual configuration. The rest run automatically.
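One way to picture the move from manual to automatic plans is a schedule of plans triggered relative to the sale start. The sketch below is hypothetical throughout: the `Plan` type, the `due_plans` function, the plan names, and the offsets are illustrative, with only the 30-minute warm-up echoing a figure mentioned earlier in the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    offset_seconds: int   # relative to sale start; negative means before the sale
    manual: bool = False  # manual plans are left to the operator


def due_plans(plans, sale_start, now, already_run):
    """Return automatic plans whose trigger time has passed and that have not run yet."""
    return [
        p for p in plans
        if not p.manual
        and p.name not in already_run
        and now >= sale_start + p.offset_seconds
    ]


# Hypothetical plan set: two automatic plans and one that stays manual.
plans = [
    Plan("warm_cache", -1800),
    Plan("degrade_marketing", -600),
    Plan("adjust_limit", 0, manual=True),
]
```

A scheduler that polls `due_plans` and executes the result would leave the operator with only the single manual plan to configure.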

Results of normalization

The normalized special projects achieved significant results. Over the past year they supported nearly a thousand ticket-rush projects, of which nearly one hundred were large rushes. In recent popular "super rush" projects, product details, order rendering, and order placement hit record peaks, and the system carried them successfully. In popular "big rush" projects, privilege-code seat selection and ordinary seat selection both hit record peaks, and the system carried them successfully.

Epilogue

Damai ticket grabbing has come a long way since the "primitive" stage. After four stages of process building, technical optimization, and special safeguard projects, the system is rock solid, and efforts to improve the user experience continue. Of course, some problems may still exist and the current technical solutions may not be optimal. Items such as intelligent analysis of project popularity and automatic adjustment of risk control are already in the technical optimization plan.

【end】


Origin blog.csdn.net/dQCFKyQDXYm3F8rB0/article/details/105259336