Problem description
I am refactoring an analytics system that performs a lot of calculations, and I need some ideas on possible architectural designs for a data consistency issue I am facing.
Current architecture
I have a queue-based system in which different requesting applications create messages that are eventually consumed by workers.
Each "Requesting App" breaks a large calculation down into smaller pieces that are sent to the queue and processed by the workers.
When all the pieces are finished, the originating "Requesting App" consolidates the results.
The workers also read information from a centralized database (SQL Server) in order to process the requests. (Important: the workers do not change any data in the database, they only read it.)
The problem
So far, so good. The problem arises when we include a web service that updates the information in the database. This can happen at any time, but it is critical that every piece of a "large calculation" originating from the same "Requesting App" sees the same data in the database.
For example:
- App A generates messages A1 and A2 and sends them to the queue.
- Worker W1 picks up message A1 for processing.
- The web server updates the database, changing it from state S0 to S1.
- Worker W2 picks up message A2 for processing.
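The sequence above can be reproduced in a few lines. This is only an illustrative sketch in Python (the real system is C# and SQL Server); the `database` dictionary, the `rate` key, and the `process` function are hypothetical stand-ins, not part of the actual system.

```python
# Hypothetical stand-in for the SQL Server data: a single value tagged
# with the database state it belongs to.
database = {"rate": "S0"}


def process(message, db):
    """A worker reads whatever the database holds at processing time."""
    return (message, db["rate"])


results = []
results.append(process("A1", database))  # W1 picks up A1 and reads S0
database["rate"] = "S1"                   # the web service updates the database
results.append(process("A2", database))  # W2 picks up A2 and reads S1

states_seen = {state for _, state in results}
print(results)               # [('A1', 'S0'), ('A2', 'S1')]
print(len(states_seen) > 1)  # True: job A mixed two database states
```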
I can't have worker W2 using state S1 of the database. For the whole calculation to be consistent, it should use the previous state, S0.
Ideas
A lock pattern that prevents the web server from changing the database while any worker is reading from it.
- cons: The lock might be held for a long time, since the calculations from different "Requesting Apps" may overlap (A1, B1, A2, B2, C1, B3, etc.).
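To make the drawback concrete, here is a minimal in-process sketch of such a gate, under the assumption that the web service may write only when no job is running. `UpdateGate` and the timeline are hypothetical, and a real deployment would need a lock shared across servers rather than a Python object.

```python
class UpdateGate:
    """Tracks running jobs; writes are allowed only when none are running."""

    def __init__(self):
        self.active_jobs = set()

    def start(self, job):
        self.active_jobs.add(job)

    def finish(self, job):
        self.active_jobs.discard(job)

    def can_write(self):
        return not self.active_jobs


gate = UpdateGate()
# Overlapping jobs: A starts, B starts before A ends, C starts before B ends.
timeline = [
    ("start", "A"), ("start", "B"), ("finish", "A"),
    ("start", "C"), ("finish", "B"), ("finish", "C"),
]

writable = []
for action, job in timeline:
    getattr(gate, action)(job)         # apply the start/finish event
    writable.append(gate.can_write())  # could the web service write now?

print(writable)  # [False, False, False, False, False, True]
```

With overlapping jobs the web service only gets a chance to write once every job has drained, which is the long lock the con describes.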
Create a new layer between the database and the workers (a server that controls database caching per requesting app).
- cons: Adding another layer might impose significant overhead (maybe?), and it is a lot of work, since I would have to rewrite the persistence code of the workers (a lot of code).
I am leaning toward the second solution, but I am not very confident about it.
Any brilliant ideas? Am I designing it wrong, or missing something?
Notes:
- This is a huge 2-tier legacy system (in C#) that we are trying to evolve into a highly scalable solution with as little effort as possible.
- Each worker may run on a different server.
Recommended answer
Thanks, everyone, for the help.
Since I believe this problem may be common in other scenarios, I would like to share the solution we chose.
Thinking more thoroughly about the problem, I understood it for what it really is.
- I needed some kind of session control for each job.
- There was an in-process cache that served as the session control for each job.
Now that the calculation has evolved to be distributed, I just needed to evolve my cache to be distributed as well.
To do that, we chose an in-memory database (hash-value), deployed as a separate server (in this case, Redis).
Now every time I start a job, I create an ID for the job and pass it along in the job's messages.
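A minimal sketch of this step in Python (the real system is C#), assuming JSON messages. The field names `job_id`, `piece`, and `payload`, and the function `build_messages`, are hypothetical and only illustrate attaching one ID to every piece of a job.

```python
import json
import uuid


def build_messages(pieces):
    """Create one job ID and stamp it on every message of the job."""
    job_id = str(uuid.uuid4())
    messages = [
        json.dumps({"job_id": job_id, "piece": index, "payload": piece})
        for index, piece in enumerate(pieces)
    ]
    return job_id, messages


job_id, messages = build_messages(["sum region 1", "sum region 2"])
for message in messages:
    print(message)  # each message carries the same job_id
```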
When a worker wants some information from the database, it:
- Looks for the data in Redis (using the job ID).
- If the data is in Redis, uses it.
- If not, loads it from SQL and saves it in Redis (using the job ID).
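The lookup steps above could be sketched roughly as follows. This is an assumption-laden illustration, not the actual implementation: `FakeRedis` is an in-memory stand-in whose `hget` and `hsetnx` methods are modeled on the redis-py methods of the same names, the `job:<id>` key format and `read_for_job` are hypothetical, and a plain dictionary plays the role of SQL Server. Using `hsetnx` (set only if the field is absent) is my own choice, so that if two workers miss at the same time, the first stored value wins.

```python
class FakeRedis:
    """In-memory stand-in modeled on redis-py's hget/hsetnx."""

    def __init__(self):
        self.hashes = {}

    def hget(self, name, key):
        return self.hashes.get(name, {}).get(key)

    def hsetnx(self, name, key, value):
        bucket = self.hashes.setdefault(name, {})
        if key in bucket:
            return 0  # field already present: keep the first value
        bucket[key] = value
        return 1


def read_for_job(cache, job_id, key, load_from_sql):
    """Return the job's pinned copy of `key`, loading it on first use."""
    name = f"job:{job_id}"  # hypothetical key format: one hash per job
    cached = cache.hget(name, key)
    if cached is not None:
        return cached
    cache.hsetnx(name, key, load_from_sql(key))
    return cache.hget(name, key)  # re-read so every worker sees the winner


sql = {"rate": "S0"}  # stand-in for the SQL Server data
cache = FakeRedis()
first = read_for_job(cache, "A", "rate", sql.get)   # W1: miss, loads S0
sql["rate"] = "S1"                                   # web service update
second = read_for_job(cache, "A", "rate", sql.get)  # W2: hit, still S0
other = read_for_job(cache, "B", "rate", sql.get)   # a new job B sees S1
print(first, second, other)  # S0 S0 S1
```

Note that this pins each key at the moment it is first loaded for the job, so worker W2 keeps seeing S0 for anything job A has already read.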
At the end of the job, I clear all hashes associated with the job ID.
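As a rough sketch of the cleanup, assuming each job keeps its data in a single hash named `job:<id>` (a hypothetical layout), clearing the job is one delete. `FakeRedis` is again an in-memory stand-in; its `hset` and `delete` are modeled on the redis-py methods of the same names, and `finish_job` is a hypothetical helper.

```python
class FakeRedis:
    """In-memory stand-in modeled on redis-py's hset/delete."""

    def __init__(self):
        self.hashes = {}

    def hset(self, name, key, value):
        bucket = self.hashes.setdefault(name, {})
        added = 0 if key in bucket else 1
        bucket[key] = value
        return added  # number of new fields, as in Redis

    def delete(self, *names):
        removed = 0
        for name in names:
            if self.hashes.pop(name, None) is not None:
                removed += 1
        return removed  # number of keys actually removed


def finish_job(cache, job_id):
    """Drop everything cached for one job; other jobs are untouched."""
    return cache.delete(f"job:{job_id}")


cache = FakeRedis()
cache.hset("job:A", "rate", "S0")
cache.hset("job:B", "rate", "S1")
removed = finish_job(cache, "A")
print(removed, sorted(cache.hashes))  # 1 ['job:B']
```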